*Building Large Language Models for Business Applications*
Large Language Models (LLMs) are an advanced type of language model that represents a breakthrough in the field of natural language processing (NLP). These models are designed to understand and generate human-like text by applying deep learning algorithms to massive amounts of data.
If you've ever chatted with a virtual assistant or interacted with an AI customer service agent, you might have interacted with a large language model without even realizing it. These models have a wide range of applications, from chatbots to language translation to content creation.
Some of the most impressive large language models are developed by OpenAI. Their GPT-3 model, for example, has 175 billion parameters and is able to perform tasks like summarization, question-answering, and even creative writing.
How Is a Large Language Model Built?
The architecture of LLMs is based on the Transformer model, which has revolutionized NLP tasks. The Transformer model utilizes a self-attention mechanism that allows the model to focus on different parts of the input sequence, capturing dependencies and relationships between words more effectively. This architecture enables LLMs to generate coherent and contextually relevant responses, making them valuable tools for a wide range of applications.
A large-scale transformer model known as a “large language model” is typically too massive to run on a single computer and is, therefore, provided as a service over an API or web interface. These models are trained on vast amounts of text data from sources such as books, articles, websites, and numerous other forms of written content. By analyzing the statistical relationships between words, phrases, and sentences through this training process, the models can generate coherent and contextually relevant responses to prompts or queries.
OpenAI's GPT-3 model, for instance, was trained on massive amounts of internet text data, giving it the ability to understand various languages and possess knowledge of diverse topics. As a result, it can produce text in multiple styles. While its capabilities may seem impressive, including translation, text summarization, and question-answering, they are not surprising, given that these functions operate using special “grammars” that match up with prompts.
The Transformer is a type of deep learning architecture that has revolutionized the field of natural language processing. It was introduced in the paper "Attention Is All You Need" by Vaswani et al. (2017). The Transformer model employs self-attention mechanisms to capture dependencies between words in a sentence, enabling it to learn contextual relationships and generate coherent and contextually relevant text.
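The self-attention idea described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the full Transformer layer: it feeds the token vectors in directly as queries, keys, and values, whereas a real model first passes them through learned projection matrices. The function names and the toy vectors are assumptions made for this example.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over a sequence of vectors.

    Each argument is a list of equal-length float lists, one per token.
    """
    d_k = len(keys[0])
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this token's query to every token's key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        )
    return outputs


# Toy embeddings for three tokens (made-up numbers, not learned weights).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(tokens, tokens, tokens)
for row in attended:
    print([round(x, 3) for x in row])
```

Each output row is a blend of all three token vectors, weighted by how similar the tokens are to one another. That blending is how a token's representation comes to reflect its context.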
The Transformer architecture excels at handling text data, which is inherently sequential. It takes a text sequence as input and produces another text sequence as output, for example to translate a French sentence into English.

The Transformer architecture consists of two main components: the encoder and the decoder. The encoder processes the input sequence and generates a representation, which is then passed to the decoder. The decoder generates the output sequence based on the encoder's representation and previous outputs.

Here are the key components and concepts of the Transformer architecture:


Pre-training and fine-tuning are two key steps in the training process of language models, including Large Language Models (LLMs).
Pre-training: In the pre-training phase, a language model is trained on a large corpus of unlabeled text data. During this phase, the model learns to predict words from their context, and in doing so develops an understanding of language patterns, grammar, and contextual relationships. The exact objective depends on the model family. BERT-style models use masked language modeling, where certain words are randomly masked and the model learns to predict them from the remaining context. GPT-style models instead learn to predict the next word given the words that precede it.
Fine-tuning: After pre-training, the language model is fine-tuned on specific labeled datasets for specific downstream tasks. Fine-tuning involves training the pre-trained model on labeled data related to a particular task, such as question answering, sentiment analysis, or text classification. This process allows the model to adapt to the specific task by learning task-specific patterns and improving its performance. Fine-tuning is performed on a smaller dataset, which is typically task-specific and labeled by human experts.
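To make the pre-training objectives more concrete, here is a minimal sketch of how training examples could be built from raw text for masked language modeling and for next-token prediction (the objective used by GPT-style models). The function names, the "[MASK]" placeholder string, and the 15% mask rate are illustrative assumptions. Real pipelines operate on subword tokens and add further details.

```python
import random


def make_masked_example(tokens, mask_rate=0.15, seed=0):
    """Hide a random subset of tokens and record what the model must recover."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    masked, targets = [], {}
    for position, token in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append("[MASK]")
            targets[position] = token
        else:
            masked.append(token)
    return masked, targets


def make_next_token_examples(tokens):
    """Pair every prefix of the sentence with the token that follows it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


sentence = "the cat sat on the mat".split()

masked, targets = make_masked_example(sentence, mask_rate=0.3)
print(masked)
print(targets)

for context, next_token in make_next_token_examples(sentence):
    print(context, "->", next_token)
```

In both cases the labels come from the text itself, which is why pre-training can use unlabeled data.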
In the context of Large Language Models (LLMs), the terms "pre-trained" and "fine-tuned" refer to two stages in the model development process. This two-step process offers several advantages:
LLMs, such as GPT-3, GPT-2, and BERT, are examples of large-scale language models that have undergone extensive pre-training and fine-tuning processes. They have been trained on vast amounts of text data and have a large number of parameters. This pre-training and fine-tuning approach allows LLMs to capture complex language patterns, generate coherent text, and perform well on a wide range of natural language processing tasks.
Popular Large Language Models (LLMs) are advanced models that have gained significant attention in the field of natural language processing. They have been trained on massive amounts of text data and have a large number of parameters, allowing them to capture complex language patterns and generate coherent text.
Here are some examples of popular LLMs:
GPT-3 (Generative Pre-trained Transformer 3): GPT-3 is a state-of-the-art language model developed by OpenAI. It is renowned for its impressive size, consisting of 175 billion parameters. GPT-3 has been trained on a vast amount of internet text data, enabling it to understand and generate human-like text. It can perform a wide range of natural language processing tasks, including language translation, text completion, sentiment analysis, and more. GPT-3 has shown remarkable capabilities in generating coherent and contextually relevant responses, making it a powerful tool for various applications.
GPT-2 (Generative Pre-trained Transformer 2): GPT-2 is the predecessor to GPT-3, also developed by OpenAI. Although smaller in size with 1.5 billion parameters, GPT-2 still delivers impressive language generation capabilities. It has been trained on diverse internet text sources, allowing it to produce high-quality text in a variety of styles and topics. GPT-2 is widely used for tasks such as text completion, text generation, and language understanding.
BERT (Bidirectional Encoder Representations from Transformers): BERT is a groundbreaking language model developed by Google. It introduced the concept of bidirectional training, which significantly improved the understanding of context in natural language processing. BERT has been trained on large-scale text data and employs a transformer architecture. It excels in various language understanding tasks, including question-answering, sentiment analysis, named entity recognition, and more. BERT has set new benchmarks in several natural language processing tasks and has been widely adopted in both research and industry.
Large language models like GPT-3, GPT-2, and BERT exhibit impressive capabilities in tasks such as:
They can understand complex language structures, generate coherent text, and perform well on a range of natural language processing tasks.
However, it's essential to acknowledge the limitations of LLMs:
LangChain is a framework for developing applications powered by language models. It integrates language models with other tools and APIs to create a powerful and flexible language processing pipeline, connecting models such as OpenAI's GPT-3 or GPT-2 with external components to enhance their functionality and address specific business needs.
The LangChain concept aims to leverage the strengths of each language model and API to create a comprehensive language processing system. It allows developers to combine different models for tasks like question answering, text generation, translation, summarization, sentiment analysis, and more.
The core idea of the library is that we can “chain” together different components to create more advanced use cases around LLMs. Chains may consist of multiple components from several modules:
Prompt templates: Templates for different types of prompts, such as “chatbot” style templates, ELI5 question-answering, etc.
LLMs: Large language models like GPT-3, BLOOM, etc
Agents: Agents use LLMs to decide what actions should be taken. Tools like web search or calculators can be used, and all are packaged into a logical loop of operations.
Memory: Short-term memory, long-term memory.
Using LangChain will usually require integrations with one or more model providers, data stores, APIs, etc. For this example, we'll use OpenAI's model APIs.
.env
Accessing the API requires an API key, which you can get by creating an OpenAI account. To set up an API key with a .env file in your Python project, follow these general steps:
Obtain an API key: If you're working with an external API or service that requires an API key, you need to obtain one from the provider. This usually involves signing up for an account and generating an API key specific to your project.
Create a .env file: In your project directory, create a new file and name it ".env". This file will store your API key and other sensitive information securely.
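Store API key in .env: Open the .env file in a text editor and add a line to store your API key. The format should be API_KEY=your_api_key, where "API_KEY" is the name of the variable and "your_api_key" is the actual value of your API key. For the OpenAI examples in this tutorial, name the variable OPENAI_API_KEY, because that is the name LangChain's OpenAI wrapper reads from the environment. Make sure not to include any quotes or spaces around the value.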
Load environment variables: In your Python code, you need to load the environment variables from the .env file before accessing them. Import the dotenv module and add the following code at the beginning of your script:
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv()
python-dotenv is a popular Python library that simplifies the process of loading environment variables from a .env file into your Python application. It allows you to store configuration variables separately from your code, making it easier to manage sensitive information such as API keys, database credentials, or other environment-specific settings.
from dotenv import load_dotenv
load_dotenv()
True
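Once load_dotenv() has run, the key is available as an ordinary environment variable. A small check like the one below can fail early with a readable message if the key is missing, rather than surfacing as an authentication error later. The helper name require_env and the DEMO_API_KEY variable are made up for this sketch. In the tutorial's setting you would call it with "OPENAI_API_KEY".

```python
import os


def require_env(name):
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment"
        )
    return value


# Demonstration with a throwaway variable so the snippet runs anywhere.
# In the tutorial you would call require_env("OPENAI_API_KEY") after load_dotenv().
os.environ["DEMO_API_KEY"] = "not-a-real-key"
print("key length:", len(require_env("DEMO_API_KEY")))
```

Printing only the length, not the key itself, keeps the secret out of notebook output.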
In LangChain, a QuickStart involves working with three key components: Prompt, Chain, and Agent.
With the Prompt, Chain, and Agent components working together, we can build interactive applications on top of the language model. The Prompt supplies the instruction and context, the Chain links the prompt to the language model (and optionally to other components), and the Agent uses the language model to decide which actions to take.
Using these components, we can build dynamic and interactive applications that involve back-and-forth interactions with the language model, allowing us to create conversational agents, chatbots, question-answering systems, and more.
To use the LangChain library with an OpenAI language model, we follow these steps:
Importing the Required Module: The code imports the OpenAI class from the LangChain library by using the statement from langchain import OpenAI.
Creating an OpenAI Instance: The code creates an instance of the OpenAI class and assigns it to the variable llm. This instance represents the connection to the OpenAI language model.
Setting the Temperature Parameter: The temperature parameter is passed to the OpenAI instance during its initialization. Temperature is a parameter that controls the randomness of the language model's output.
A higher temperature value (e.g., 0.9) makes the generated text more diverse and creative, while a lower value (e.g., 0.2) makes it more focused and deterministic.
from langchain import OpenAI
llm = OpenAI(temperature=0.1)  # GPT-3 family completion model
By creating an instance of OpenAI and setting the desired temperature, we can now use the llm object to interact with the OpenAI language model. We can pass prompts or messages to the llm object, receive the generated responses, and customize the behavior of the language model using additional parameters and methods provided by the LangChain library.
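To see why temperature changes how random the output is, the sketch below applies it to a made-up set of scores for three candidate tokens. It illustrates the general technique of dividing the scores by the temperature before a softmax. It is not OpenAI's implementation, and the function name and scores are assumptions. The sketch rejects a temperature of 0, which model APIs usually handle as a special case.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.2, 0.9):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lower temperature almost all of the probability goes to the top-scoring token, so sampling is close to deterministic. At the higher temperature the other candidates keep a meaningful share, so repeated runs vary more.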
Prompt refers to the initial input or instruction given to the language model to generate a response. It sets the context and provides guidance for the language model to produce relevant and coherent text.
In this example, the prompt asks for a suggestion of a good name for a brand specializing in local burgers.
prompt = "What is a good name for a brand that makes local burger?"
print(llm.predict(prompt))
Burger Towne.
The language model then uses its knowledge and training to generate a response that fits the given prompt. Notice that each re-run may generate a different answer.
The llm.predict() function is called with the prompt as the input. This function sends the prompt to the language model and generates a response based on the given input. The generated text represents the language model's prediction or completion of the prompt.
We can also do this in other languages. Let's try Bahasa Indonesia:
print(llm.predict("Nama yang bagus untuk brand yang membuat pisang goreng mentai?"))
Mentai Pisang.
By simply providing a prompt in Bahasa Indonesia, we can obtain a generated response in the same language. This showcases the versatility of the language models that LangChain connects to in understanding and generating text in various languages, allowing for multilingual applications and interactions.
LLM applications typically utilize a prompt template instead of directly inputting user queries into the LLM. This approach involves incorporating the user input into a larger text context known as a prompt template.
A prompt template is a structured format designed to generate prompts in a consistent manner. It consists of a text string, referred to as the "template," which can incorporate various parameters provided by the end user to create a dynamic prompt.
The prompt template can include:
In the previous example, the text passed to the model contained instructions to generate a brand name based on a given description. In our application, it would be convenient for users to only provide the description of their company or product without the need to explicitly provide instructions to the model.
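Before turning to LangChain's own class, the underlying idea can be shown with Python's built-in string formatting: the instruction is fixed, and only the user's short description is substituted in. The names INSTRUCTION and build_prompt are made up for this sketch.

```python
INSTRUCTION = "What is a good name for a brand that makes {product}?"


def build_prompt(product):
    """Wrap the user's short description in the fixed instruction text."""
    return INSTRUCTION.format(product=product)


print(build_prompt("local burger"))
# prints: What is a good name for a brand that makes local burger?
```

A prompt template class adds conveniences on top of this idea, such as keeping track of which variables the template expects.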
To create a prompt template using LangChain, we begin by importing the PromptTemplate class from the langchain.prompts module. This class allows us to create and manipulate prompt templates.
from langchain.prompts import PromptTemplate
Create a prompt template: Use the PromptTemplate.from_template() method to create a PromptTemplate object from the template string.
In this case, the template string is "What is a good name for a brand that makes {product}?", where {product} acts as a placeholder for the product name.
# Create a prompt template
template_prompt = PromptTemplate.from_template("What is a good name for a brand that makes {product}?")
Format the prompt template: Use the .format() method of the PromptTemplate object to replace the placeholder in the template with the desired value. In this case, the placeholder {product} is replaced with the string "local burger".
# Format the prompt template
prompt = template_prompt.format(product="local burger")
# Print the prompt
print(prompt)
What is a good name for a brand that makes local burger?
Notice that the instruction changes automatically based on the user input. This instruction is then passed to llm to generate the response. Let's get the response generated by the language model (llm) for the given prompt.
print(llm.predict(prompt))
Burger Towne.
Because this is a template, it can handle more than one input. For example:
# defines a string template for a poem
template = "Write a {adjective} poem about {subject}"
# creates a prompt template
poem_template = PromptTemplate(
    input_variables=["adjective", "subject"],
    template=template,
)
print(poem_template)
input_variables=['adjective', 'subject'] output_parser=None partial_variables={} template='Write a {adjective} poem about {subject}' template_format='f-string' validate_template=True
The .format() method fills in the template by replacing the placeholders {adjective} and {subject} with the provided values. The resulting string will be "Write a sad poem about ducks".
poem_template.format(adjective='sad', subject='ducks')
'Write a sad poem about ducks'
# generate a response
print(llm.predict(poem_template.format(adjective='sad', subject='ducks')))
The ducks are so sad, Their feathers are drooping, Their quacks are so quiet, Their heads are all looping. The pond is so still, The water so still, The ducks are so still, Their sadness to fill. The sky is so grey, The clouds are so low, The ducks are so lonely, No one to show. The ducks are so sad, Their feathers so dull, Their quacks so quiet, Their heads so full.
We can also create a prompt template that acts as a naming consultant for new companies:
# Define the prompt template
template = """
I want you to act as a naming consultant for new companies.
Here are some examples of good company names:
- search engine, Google
- social media, Facebook
- video sharing, YouTube
The name should be short, catchy and easy to remember.
What is a good name for a brand that makes {product}?
"""
# Create a PromptTemplate object
brand_template = PromptTemplate(
    input_variables=["product"],
    template=template,
)
# Format the prompt template with specific industry values
batik_prompt = brand_template.format(product='batik')
# Send the formatted prompt to the model and print its response
print(llm.predict(batik_prompt))
Batikista
By using prompt templates, we can easily generate prompts for various industries by filling in the specific values for the variables. This approach allows us to create dynamic and customizable prompts for the naming consultant application.
Now that we have our model and prompt template, we can combine them by creating a "chain". Chains provide a mechanism to link or connect multiple components, such as models, prompts, and other chains.
The most common type of chain is an LLMChain, which involves passing the input through a PromptTemplate and then to an LLM. We can create an LLMChain using our existing model and prompt template.
For example, if we want to generate a response using our template, our workflow would be as follows:
# Create a prompt template
template_prompt = PromptTemplate.from_template("What is a good name for a brand that makes {product}?")
prompt = template_prompt.format(product="rendang mozarella")
print(prompt)
What is a good name for a brand that makes rendang mozarella?
print(llm.predict(prompt))
Mozarella Rendang Co.
We can simplify this workflow by linking the prompt template and the model together with a chain:
# Import LLMChain class from langchain
from langchain.chains import LLMChain
# Chain the prompt template and llm
chain = LLMChain(llm=llm, prompt=template_prompt, verbose=True)
# Execute the chained model and prompt template
print(chain.run('rendang mozarella'))
> Entering new chain...
Prompt after formatting:
What is a good name for a brand that makes rendang mozarella?
> Finished chain.
Mozarella Rendang Co.
The chain.run() method formats the prompt template with the provided input and returns the response generated by the LLM.
By chaining the LLM and the prompt template using the LLMChain class, we can conveniently pass inputs through the template and obtain contextually relevant responses from the model. This simple chain allows us to generate responses with just one line of code for each new input. Understanding the workings of this basic chain will serve as a solid foundation for working with more intricate chains.
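Conceptually, a chain of this kind does two things in order: it fills in the template, then hands the result to the model. The sketch below mimics that flow with a made-up SimpleChain class and a stand-in fake_llm function, so it runs without an API key. It is an illustration of the idea, not LangChain's actual implementation.

```python
class SimpleChain:
    """Format a template with the input, then pass the result to a model callable."""

    def __init__(self, llm, template):
        self.llm = llm
        self.template = template

    def run(self, **variables):
        prompt = self.template.format(**variables)  # step 1: build the prompt
        return self.llm(prompt)                     # step 2: call the model


def fake_llm(prompt):
    """Stand-in for a real model call; it just reports what it received."""
    return f"[model received: {prompt}]"


chain_sketch = SimpleChain(
    fake_llm, "What is a good name for a brand that makes {product}?"
)
print(chain_sketch.run(product="rendang mozarella"))
```

Swapping fake_llm for a function that calls a real model would leave the rest of the code unchanged, which is the main convenience a chain offers.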
In more complex workflows, it becomes crucial to have the ability to make decisions and choose actions based on the given context. This is where agents come into play.
Agents utilize a language model to determine which actions to take and in what sequence. They have a set of tools at their disposal, and they repeatedly select a tool, execute it, and evaluate the result until they arrive at an answer. Agents provide a dynamic and adaptable approach to problem-solving within the LangChain framework, allowing for more sophisticated and flexible workflows.
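The select, execute, and evaluate loop can be sketched without any model at all. In the example below a hand-written scripted_policy function plays the role the LLM would play in a real agent, and the two tools are toy functions. All names here are hypothetical and chosen only to illustrate the control flow.

```python
def calculator(expression):
    """Toy tool: add the integers in a string like '2 + 3'."""
    return str(sum(int(part) for part in expression.split("+")))


def lookup(term):
    """Toy tool: look a term up in a tiny hard-coded table."""
    facts = {"rendang": "Rendang is a dish from West Sumatra, Indonesia."}
    return facts.get(term.lower(), "no entry found")


TOOLS = {"calculator": calculator, "lookup": lookup}


def scripted_policy(question, observations):
    """Stand-in for the LLM: decide the next step from what has been seen so far."""
    if not observations:
        if "+" in question:
            return ("calculator", question)
        return ("lookup", question)
    # Once a tool has returned something, treat it as the final answer.
    return ("finish", observations[-1])


def run_agent(question, policy, tools, max_steps=5):
    """Ask the policy for an action, run the tool, feed the result back, repeat."""
    observations = []
    for _ in range(max_steps):
        action, action_input = policy(question, observations)
        if action == "finish":
            return action_input
        observations.append(tools[action](action_input))
    return "stopped: step limit reached"


print(run_agent("2 + 3", scripted_policy, TOOLS))
print(run_agent("rendang", scripted_policy, TOOLS))
```

The step limit matters in practice: because the model chooses each action, a real agent needs a bound on how many times the loop may run.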
To load an agent in LangChain, you need to consider the following components:
LLM/Chat model: This refers to the language model that powers the agent. It is responsible for generating responses based on the given input. You can choose from various pre-trained models or use your own custom models.
Tools: Tools are functions or methods that perform specific tasks within the agent's workflow. These can include actions like Google Search, Database lookup, Python REPL (Read-Eval-Print Loop), or even other chains. LangChain provides a set of predefined tools with their specifications, which you can refer to in the Tools documentation.
Agent name: The agent name is a string that identifies a supported agent class. Each agent class is parameterized by the prompt that the language model uses to determine the appropriate action to take. In this context, we will focus on using the standard supported agents, rather than implementing custom agents. You can explore the list of supported agents and their specifications to choose the most suitable one for your application.
For the specific example mentioned, we will utilize the wikipedia tool to query and retrieve responses based on Wikipedia information. This tool allows the agent to access relevant information from Wikipedia and provide informative responses based on the given input.
Import the required names: The code starts by importing AgentType, initialize_agent, and load_tools from the langchain.agents module. These provide the functionality required to create and configure the agent.
from langchain.agents import AgentType, initialize_agent, load_tools
Define the language model for the agent: In this example, the llm_agent is initialized with the OpenAI class, which represents the language model. The temperature parameter determines the level of randomness in the generated responses.
# The language model we're going to use to control the agent.
llm_agent = OpenAI(temperature=0)
Load the tools: The load_tools function is used to load the desired tools for the agent. In this case, the tools "wikipedia" and "llm-math" are loaded.
The "wikipedia" tool allows the agent to access information from Wikipedia, while the "llm-math" tool utilizes the language model for mathematical operations.
# The tools we'll give the Agent access to. Note that the 'llm-math' tool uses an LLM, so we need to pass that in.
tools = load_tools(["wikipedia", "llm-math"], llm=llm_agent)
Initialize the agent: The initialize_agent function is called to create an agent instance. It takes the loaded tools, the language model (llm_agent), the agent type (AgentType.ZERO_SHOT_REACT_DESCRIPTION), and an optional verbose parameter. The agent type determines how the agent reasons: ZERO_SHOT_REACT_DESCRIPTION uses the ReAct framework to choose which tool to call based solely on each tool's description, without any prior examples.
# Finally, let's initialize an agent with the tools, the language model, and the type of agent we want to use.
agent = initialize_agent(tools, llm_agent, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent
AgentExecutor(memory=None, callbacks=None, callback_manager=None, verbose=True, tags=['zero-shot-react-description'], agent=ZeroShotAgent(llm_chain=LLMChain(memory=None, callbacks=None, callback_manager=None, verbose=False, tags=None, prompt=PromptTemplate(input_variables=['input', 'agent_scratchpad'], output_parser=None, partial_variables={}, template='Answer the following questions as best you can. You have access to the following tools:\n\nWikipedia: A wrapper around Wikipedia. Useful for when you need to answer general questions about people, places, companies, facts, historical events, or other subjects. Input should be a search query.\nCalculator: Useful for when you need to answer questions about math.\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [Wikipedia, Calculator]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: {input}\nThought:{agent_scratchpad}', template_format='f-string', validate_template=True), llm=OpenAI(cache=None, verbose=False, callbacks=None, callback_manager=None, tags=None, client=<class 'openai.api_resources.completion.Completion'>, model_name='text-davinci-003', temperature=0.0, max_tokens=256, top_p=1, frequency_penalty=0, presence_penalty=0, n=1, best_of=1, model_kwargs={}, openai_api_key='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', openai_api_base='', openai_organization='', openai_proxy='', batch_size=20, request_timeout=None, logit_bias={}, max_retries=6, streaming=False, allowed_special=set(), disallowed_special='all', tiktoken_model_name=None), output_key='text', output_parser=NoOpOutputParser(), return_final_only=True, llm_kwargs={}), output_parser=MRKLOutputParser(), 
allowed_tools=['Wikipedia', 'Calculator']), tools=[WikipediaQueryRun(name='Wikipedia', description='A wrapper around Wikipedia. Useful for when you need to answer general questions about people, places, companies, facts, historical events, or other subjects. Input should be a search query.', args_schema=None, return_direct=False, verbose=False, callbacks=None, callback_manager=None, handle_tool_error=False, api_wrapper=WikipediaAPIWrapper(wiki_client=<module 'wikipedia' from 'C:\\Users\\USER\\anaconda3\\envs\\dss_llm\\lib\\site-packages\\wikipedia\\__init__.py'>, top_k_results=3, lang='en', load_all_available_meta=False, doc_content_chars_max=4000)), Tool(name='Calculator', description='Useful for when you need to answer questions about math.', args_schema=None, return_direct=False, verbose=False, callbacks=None, callback_manager=None, handle_tool_error=False, func=<bound method Chain.run of LLMMathChain(memory=None, callbacks=None, callback_manager=None, verbose=False, tags=None, llm_chain=LLMChain(memory=None, callbacks=None, callback_manager=None, verbose=False, tags=None, prompt=PromptTemplate(input_variables=['question'], output_parser=None, partial_variables={}, template='Translate a math problem into a expression that can be executed using Python\'s numexpr library. 
Use the output of running this code to answer the question.\n\nQuestion: ${{Question with math problem.}}\n```text\n${{single line mathematical expression that solves the problem}}\n```\n...numexpr.evaluate(text)...\n```output\n${{Output of running the code}}\n```\nAnswer: ${{Answer}}\n\nBegin.\n\nQuestion: What is 37593 * 67?\n```text\n37593 * 67\n```\n...numexpr.evaluate("37593 * 67")...\n```output\n2518731\n```\nAnswer: 2518731\n\nQuestion: 37593^(1/5)\n```text\n37593**(1/5)\n```\n...numexpr.evaluate("37593**(1/5)")...\n```output\n8.222831614237718\n```\nAnswer: 8.222831614237718\n\nQuestion: {question}\n', template_format='f-string', validate_template=True), llm=OpenAI(cache=None, verbose=False, callbacks=None, callback_manager=None, tags=None, client=<class 'openai.api_resources.completion.Completion'>, model_name='text-davinci-003', temperature=0.0, max_tokens=256, top_p=1, frequency_penalty=0, presence_penalty=0, n=1, best_of=1, model_kwargs={}, openai_api_key='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', openai_api_base='', openai_organization='', openai_proxy='', batch_size=20, request_timeout=None, logit_bias={}, max_retries=6, streaming=False, allowed_special=set(), disallowed_special='all', tiktoken_model_name=None), output_key='text', output_parser=NoOpOutputParser(), return_final_only=True, llm_kwargs={}), llm=None, prompt=PromptTemplate(input_variables=['question'], output_parser=None, partial_variables={}, template='Translate a math problem into a expression that can be executed using Python\'s numexpr library. 
Use the output of running this code to answer the question.\n\nQuestion: ${{Question with math problem.}}\n```text\n${{single line mathematical expression that solves the problem}}\n```\n...numexpr.evaluate(text)...\n```output\n${{Output of running the code}}\n```\nAnswer: ${{Answer}}\n\nBegin.\n\nQuestion: What is 37593 * 67?\n```text\n37593 * 67\n```\n...numexpr.evaluate("37593 * 67")...\n```output\n2518731\n```\nAnswer: 2518731\n\nQuestion: 37593^(1/5)\n```text\n37593**(1/5)\n```\n...numexpr.evaluate("37593**(1/5)")...\n```output\n8.222831614237718\n```\nAnswer: 8.222831614237718\n\nQuestion: {question}\n', template_format='f-string', validate_template=True), input_key='question', output_key='answer')>, coroutine=<bound method Chain.arun of LLMMathChain(memory=None, callbacks=None, callback_manager=None, verbose=False, tags=None, llm_chain=LLMChain(memory=None, callbacks=None, callback_manager=None, verbose=False, tags=None, prompt=PromptTemplate(input_variables=['question'], output_parser=None, partial_variables={}, template='Translate a math problem into a expression that can be executed using Python\'s numexpr library. 
Use the output of running this code to answer the question.\n\nQuestion: ${{Question with math problem.}}\n```text\n${{single line mathematical expression that solves the problem}}\n```\n...numexpr.evaluate(text)...\n```output\n${{Output of running the code}}\n```\nAnswer: ${{Answer}}\n\nBegin.\n\nQuestion: What is 37593 * 67?\n```text\n37593 * 67\n```\n...numexpr.evaluate("37593 * 67")...\n```output\n2518731\n```\nAnswer: 2518731\n\nQuestion: 37593^(1/5)\n```text\n37593**(1/5)\n```\n...numexpr.evaluate("37593**(1/5)")...\n```output\n8.222831614237718\n```\nAnswer: 8.222831614237718\n\nQuestion: {question}\n', template_format='f-string', validate_template=True), llm=OpenAI(cache=None, verbose=False, callbacks=None, callback_manager=None, tags=None, client=<class 'openai.api_resources.completion.Completion'>, model_name='text-davinci-003', temperature=0.0, max_tokens=256, top_p=1, frequency_penalty=0, presence_penalty=0, n=1, best_of=1, model_kwargs={}, openai_api_key='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', openai_api_base='', openai_organization='', openai_proxy='', batch_size=20, request_timeout=None, logit_bias={}, max_retries=6, streaming=False, allowed_special=set(), disallowed_special='all', tiktoken_model_name=None), output_key='text', output_parser=NoOpOutputParser(), return_final_only=True, llm_kwargs={}), llm=None, prompt=PromptTemplate(input_variables=['question'], output_parser=None, partial_variables={}, template='Translate a math problem into a expression that can be executed using Python\'s numexpr library. 
Use the output of running this code to answer the question.\n\nQuestion: ${{Question with math problem.}}\n```text\n${{single line mathematical expression that solves the problem}}\n```\n...numexpr.evaluate(text)...\n```output\n${{Output of running the code}}\n```\nAnswer: ${{Answer}}\n\nBegin.\n\nQuestion: What is 37593 * 67?\n```text\n37593 * 67\n```\n...numexpr.evaluate("37593 * 67")...\n```output\n2518731\n```\nAnswer: 2518731\n\nQuestion: 37593^(1/5)\n```text\n37593**(1/5)\n```\n...numexpr.evaluate("37593**(1/5)")...\n```output\n8.222831614237718\n```\nAnswer: 8.222831614237718\n\nQuestion: {question}\n', template_format='f-string', validate_template=True), input_key='question', output_key='answer')>)], return_intermediate_steps=False, max_iterations=15, max_execution_time=None, early_stopping_method='force', handle_parsing_errors=False)
agent.run('Who is the current president of The Republic of Indonesia')
> Entering new chain...
I need to find out who the current president of Indonesia is
Action: Wikipedia
Action Input: "President of Indonesia"
Observation: Page: President of Indonesia
Summary: The president of the Republic of Indonesia (Indonesian: Presiden Republik Indonesia) is both the head of state and the head of government of the Republic of Indonesia. The president leads the executive branch of the Indonesian government and is the commander-in-chief of the Indonesian National Armed Forces. Since 2004, the president and vice president are directly elected to a five-year term, once renewable, allowing for a maximum of 10 years in office. Joko Widodo is the seventh and current president of Indonesia. He assumed office on 20 October 2014.

Page: List of presidents of Indonesia
Summary: The president is the head of state and also head of government of the Republic of Indonesia. The president leads the executive branch of the Indonesian government and is the commander-in-chief of the Indonesian National Armed Forces. Since 2004, the president and vice president are directly elected to a five-year term. The presidency was established during the formulation of the 1945 constitution by the Investigating Committee for Preparatory Work for Independence (BPUPK), a body established by the occupying Japanese 16th Army on 1 March 1945 to work on "preparations for independence in the region of the government of this island of Java". On 18 August 1945, the Preparatory Committee for Indonesian Independence (PPKI), which was created on 7 August to replace the BPUPK, selected Sukarno as the country's first president.

Page: Vice President of Indonesia
Summary: The vice president of the Republic of Indonesia (Indonesian: Wakil Presiden Republik Indonesia) is second-highest officer in the executive branch of the Indonesian government, after the president, and ranks first in the presidential line of succession. Since 2004, the president and vice president are directly elected to a five-year term. Ma'ruf Amin is the 13th and current vice president of Indonesia. He assumed office on 20 October 2019.
Thought: I now know the final answer
Final Answer: Joko Widodo is the current president of The Republic of Indonesia and Ma'ruf Amin is the current vice president.

> Finished chain.
"Joko Widodo is the current president of The Republic of Indonesia and Ma'ruf Amin is the current vice president."
By executing these steps, we establish an agent that can utilize various tools and interact with the chosen language model to generate contextually relevant responses based on the given input.
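To make the Thought / Action / Observation cycle in the trace above more concrete, here is a minimal sketch of such a loop in plain Python. It is not LangChain's implementation: the names run_agent, parse_step, and scripted_llm are hypothetical, the calculator is a toy that only handles integer multiplication, and the "LLM" is a scripted function so the example runs offline. Only the text format (Action, Action Input, Observation, Final Answer) and the iteration limit of 15 are taken from the output shown above.

```python
import re

# Hypothetical toy tool: only supports "a * b" with integers.
def calculator(expression: str) -> str:
    left, right = expression.split("*")
    return str(int(left) * int(right))

# Tools are looked up by the name the model writes after "Action:".
TOOLS = {"Calculator": calculator}

ACTION_RE = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)")

def parse_step(llm_output: str):
    """Return ("final", answer) or ("action", tool_name, tool_input)."""
    if "Final Answer:" in llm_output:
        return ("final", llm_output.split("Final Answer:", 1)[1].strip())
    match = ACTION_RE.search(llm_output)
    if match is None:
        raise ValueError(f"Could not parse LLM output: {llm_output!r}")
    return ("action", match.group(1).strip(), match.group(2).strip().strip('"'))

def run_agent(llm, question: str, max_iterations: int = 15) -> str:
    """Call the model, run the tool it asks for, feed the result back, repeat."""
    scratchpad = f"Question: {question}\n"
    for _ in range(max_iterations):
        output = llm(scratchpad)
        step = parse_step(output)
        if step[0] == "final":
            return step[1]
        _, tool_name, tool_input = step
        observation = TOOLS[tool_name](tool_input)
        scratchpad += f"{output}\nObservation: {observation}\nThought: "
    return "Agent stopped: iteration limit reached."

# Scripted stand-in for a real LLM so the sketch runs without an API key.
def scripted_llm(prompt: str) -> str:
    if "Observation:" not in prompt:
        return "I need to multiply.\nAction: Calculator\nAction Input: 37593 * 67"
    observation = prompt.rsplit("Observation: ", 1)[1].split("\n", 1)[0]
    return f"I now know the final answer\nFinal Answer: {observation}"

answer = run_agent(scripted_llm, "What is 37593 * 67?")
print(answer)
```

The design point is that the model never runs anything itself: it only writes text, and the surrounding loop decides which tool to call and appends the result as an Observation for the next call.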
As we know, LangChain is an open-source library that provides developers with powerful tools for building applications using Large Language Models (LLMs). In our previous example, we saw how we could use an LLM to generate responses based on a given question. However, there may be cases where we need to ask more specific questions related to our business domain. For instance, we might want to ask the LLM about our company's top revenue-generating product.
LLMs have certain limitations when it comes to specific contextual knowledge, as they are trained on a vast amount of general information. To overcome this limitation, we can provide additional documents or context to the LLM. The idea is to retrieve relevant documents related to our question from a corpus or database and then pass them along with the original question to the LLM. This allows the LLM to generate a response that is informed by the specific information contained in the retrieved documents.
These documents can come from various sources such as databases, PDF files, plain text files, or even information extracted from websites. By connecting and feeding these documents to the LLM, we can build a powerful Question-Answer System that leverages the LLM's language generation capabilities while incorporating domain-specific knowledge.
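The retrieve-then-ask idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the documents are invented, retrieval is a naive word-overlap score rather than the embedding search a real system would likely use, and retrieve and build_prompt are hypothetical helper names, not LangChain functions.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, k=2):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, context_docs):
    """Put the retrieved documents in front of the question."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Invented example documents (not from any real company data).
documents = [
    "Product A generated the highest revenue last quarter.",
    "The office cafeteria opens at 8 am.",
    "Product B revenue grew slowly last quarter.",
]

question = "Which product has the highest revenue?"
top_docs = retrieve(question, documents, k=2)
prompt = build_prompt(question, top_docs)
print(prompt)  # this string is what would be sent to the LLM
```

The LLM then answers from the supplied context instead of relying only on what it saw during training.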
In this section, we will explore how to connect and feed a database and text information to an LLM to build a Question-Answer System that can provide contextually relevant answers to specific business-related questions.
Structured data is not only stored in database files; it can also be stored in other formats such as .xlsx and .csv, which represent data in a tabular form with columns and rows. In addition to providing agents to generate answers from databases using SQL based on natural language prompts, LangChain also offers agents to generate answers based on tabular structured data sources, such as CSV files. In this section, we will demonstrate how to utilize the agent for CSV data.
To begin, let's define the file paths of our datasets about tourism in Indonesia. I have four CSV files, and we can ask the agent to explain the contents of each one.
filepath = "data_input/package_tourism.csv"
filepath2 = "data_input/tourism_rating.csv"
filepath3 = "data_input/tourism_with_id.csv"
filepath4 = "data_input/user.csv"
Next, we will create an agent specifically designed for working with CSV data. This agent will allow us to query and retrieve information from the tourism dataset.
from langchain.agents import create_csv_agent
agent = create_csv_agent(llm, filepath, verbose=True)
Then we simply ask a question about our data.
agent.run("Please explain this data to me")
> Entering new chain...
Thought: I should look at the data and explain what it is
Action: python_repl_ast
Action Input: print(df.head())
Observation:
   Package     City         Place_Tourism1      Place_Tourism2  \
0        1  Jakarta      Pasar Tanah Abang        Taman Ayodya
1        2  Jakarta      Pasar Tanah Abang  Pasar Taman Puring
2        3  Jakarta  Perpustakaan Nasional               Monas
3        4  Jakarta           Pulau Tidung      Pulau Bidadari
4        5  Jakarta  Museum Satria Mandala       Museum Wayang

          Place_Tourism3                                      Place_Tourism4  \
0         Museum Tekstil                                                 NaN
1   Pasar Petak Sembilan                                                 NaN
2        Masjid Istiqlal                                                 NaN
3             Pulau Pari                                       Pulau Pramuka
4  Museum Bahari Jakarta  Museum Macan (Modern and Contemporary Art in N...

  Place_Tourism5
0            NaN
1            NaN
2            NaN
3  Pulau Pelangi
4            NaN
Thought: I can now explain the data
Final Answer: This dataframe contains information about tourist attractions in Jakarta. It includes the package number, city, and five different places of tourism. The places of tourism include Pasar Tanah Abang, Taman Ayodya, Museum Tekstil, Pasar Taman Puring, Pasar Petak Sembilan, Masjid Istiqlal, Pulau Tidung, Pulau Bidadari, Pulau Pari, Pulau Pramuka, Museum Satria Mandala, Museum Wayang, Museum Bahari Jakarta, and Museum Macan (Modern and Contemporary Art in Nusantara).

> Finished chain.
'This dataframe contains information about tourist attractions in Jakarta. It includes the package number, city, and five different places of tourism. The places of tourism include Pasar Tanah Abang, Taman Ayodya, Museum Tekstil, Pasar Taman Puring, Pasar Petak Sembilan, Masjid Istiqlal, Pulau Tidung, Pulau Bidadari, Pulau Pari, Pulau Pramuka, Museum Satria Mandala, Museum Wayang, Museum Bahari Jakarta, and Museum Macan (Modern and Contemporary Art in Nusantara).'
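The agent's first action in the trace above was simply to look at the first rows of the file. As a rough illustration of what it sees, here is a standard-library sketch that reads the header and the first rows of a CSV. The in-memory sample is an assumption: it reuses three rows from the output above and guesses that missing values are stored as empty cells; the helper name head is hypothetical.

```python
import csv
import io
from itertools import islice

# In-memory stand-in for data_input/package_tourism.csv (assumed layout,
# rows copied from the df.head() output above, NaN written as empty cells).
SAMPLE_CSV = """Package,City,Place_Tourism1,Place_Tourism2,Place_Tourism3,Place_Tourism4,Place_Tourism5
1,Jakarta,Pasar Tanah Abang,Taman Ayodya,Museum Tekstil,,
2,Jakarta,Pasar Tanah Abang,Pasar Taman Puring,Pasar Petak Sembilan,,
3,Jakarta,Perpustakaan Nasional,Monas,Masjid Istiqlal,,
"""

def head(csv_file, n=5):
    """Return the column names and the first n rows as dictionaries."""
    reader = csv.DictReader(csv_file)
    rows = list(islice(reader, n))
    return reader.fieldnames, rows

columns, rows = head(io.StringIO(SAMPLE_CSV), n=2)
print(columns)
for row in rows:
    # Collect the non-empty Place_Tourism* cells for this package.
    places = [v for k, v in row.items() if k.startswith("Place_Tourism") and v]
    print(row["Package"], row["City"], places)
```

To try it on the real file, replace io.StringIO(SAMPLE_CSV) with open(filepath, newline="").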
To simplify the output, I will turn verbose off for the remaining agents by omitting verbose=True.
agent2 = create_csv_agent(llm, filepath2)
agent2.run("Please explain this data to me")
'This dataframe contains three columns: User_Id, Place_Id, and Place_Ratings. The first column is the user ID, the second is the place ID, and the third is the rating given by the user for the place. The dataframe contains the ratings given by the user for each place.'
agent3 = create_csv_agent(llm, filepath3)
agent3.run("Please explain this data to me")
'This dataframe contains information about places in Jakarta, including the place name, description, category, city, price, rating, time minutes, coordinates, and unnamed columns.'
agent4 = create_csv_agent(llm, filepath4)
agent4.run("Please explain this data to me")
'This dataframe contains three columns: User_Id, Location, and Age. User_Id is a numerical identifier for each row, Location is the location of the user, and Age is the age of the user.'
I am curious about the tourism rating of each place, so let's dive deeper into it.
agent2.run("What places you consider to go if i have a week for holiday in Indonesia ?")
'The places with the highest ratings are 5, 258, 393, 208, 405, etc.'
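Sometimes we have to combine information from several structured data sources to get a more useful answer. Here the rating file only contains place IDs, so we also give the agent the file that holds the place names.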
agent_combined = create_csv_agent(llm, [filepath2, filepath3], verbose=True)
agent_combined.run("What places you consider to go if i have a week for holiday in Indonesia ? please give the place name in detail !")
> Entering new chain...
Thought: I need to find places with high ratings and short time to visit
Action: python_repl_ast
Action Input: df1[df1['Place_Ratings'] > 4.5].sort_values('Time_Minutes')
Observation: KeyError: 'Time_Minutes'
Thought: I need to join the two dataframes
Action: python_repl_ast
Action Input: df3 = df1.merge(df2, on='Place_Id')
Observation:
Thought: I need to find places with high ratings and short time to visit
Action: python_repl_ast
Action Input: df3[df3['Place_Ratings'] > 4.5].sort_values('Time_Minutes')
Observation:
      User_Id  Place_Id  Place_Ratings                    Place_Name  \
5254      294         6              5       Taman Impian Jaya Ancol
5247      161         6              5       Taman Impian Jaya Ancol
5246      156         6              5       Taman Impian Jaya Ancol
5244      104         6              5       Taman Impian Jaya Ancol
5239       65         6              5       Taman Impian Jaya Ancol
...       ...       ...            ...                           ...
9871      229       361              5  Wisata Kampung Krisan Clapar
9953      126        10              5                  Pulau Tidung
9970      134         7              5        Kebun Binatang Ragunan
9980      277         7              5        Kebun Binatang Ragunan
9983      297         7              5        Kebun Binatang Ragunan

                                            Description       Category  \
5254  Taman Impian Jaya Ancol merupakan sebuah objek...  Taman Hiburan
5247  Taman Impian Jaya Ancol merupakan sebuah objek...  Taman Hiburan
5246  Taman Impian Jaya Ancol merupakan sebuah objek...  Taman Hiburan
5244  Taman Impian Jaya Ancol merupakan sebuah objek...  Taman Hiburan
5239  Taman Impian Jaya Ancol merupakan sebuah objek...  Taman Hiburan
...                                                 ...            ...
9871  Wisata Kampung Krisan Gemah Ripah di Dusun Cla...  Taman Hiburan
9953  Pulau Tidung adalah salah satu kelurahan di ke...         Bahari
9970  Kebun Binatang Ragunan adalah sebuah kebun bin...     Cagar Alam
9980  Kebun Binatang Ragunan adalah sebuah kebun bin...     Cagar Alam
9983  Kebun Binatang Ragunan adalah sebuah kebun bin...     Cagar Alam

          City   Price  Rating  Time_Minutes  \
5254   Jakarta   25000     4.5          10.0
5247   Jakarta   25000     4.5          10.0
5246   Jakarta   25000     4.5          10.0
5244   Jakarta   25000     4.5          10.0
5239   Jakarta   25000     4.5          10.0
...        ...     ...     ...           ...
9871  Semarang   10000     4.1           NaN
9953   Jakarta  150000     4.5           NaN
9970   Jakarta    4000     4.5           NaN
9980   Jakarta    4000     4.5           NaN
9983   Jakarta    4000     4.5           NaN

                                           Coordinate       Lat        Long  \
5254  {'lat': -6.117333200000001, 'lng': 106.8579951} -6.117333  106.857995
5247  {'lat': -6.117333200000001, 'lng': 106.8579951} -6.117333  106.857995
5246  {'lat': -6.117333200000001, 'lng': 106.8579951} -6.117333  106.857995
5244  {'lat': -6.117333200000001, 'lng': 106.8579951} -6.117333  106.857995
5239  {'lat': -6.117333200000001, 'lng': 106.8579951} -6.117333  106.857995
...                                               ...       ...         ...
9871  {'lat': -7.214158199999999, 'lng': 110.3769541} -7.214158  110.376954
9953  {'lat': -5.803205300000001, 'lng': 106.5237907} -5.803205  106.523791
9970          {'lat': -6.3124593, 'lng': 106.8201865} -6.312459  106.820187
9980          {'lat': -6.3124593, 'lng': 106.8201865} -6.312459  106.820187
9983          {'lat': -6.3124593, 'lng': 106.8201865} -6.312459  106.820187

      Unnamed: 11  Unnamed: 12
5254          NaN            6
5247          NaN            6
5246          NaN            6
5244          NaN            6
5239          NaN            6
...           ...          ...
9871          NaN          361
9953          NaN           10
9970          NaN            7
9980          NaN            7
9983          NaN            7

[2021 rows x 15 columns]
Thought:
I now know the final answer
Final Answer: Monumen Nasional, Kota Tua Jakarta, Dunia Fantasi, Taman Mini Indonesia Indah (TMII), Atlantis Water Adventure, Taman Impian Jaya Ancol, Wisata Kampung Krisan Clapar, Pulau Tidung, Kebun Binatang Ragunan.

> Finished chain.
'Monumen Nasional, Kota Tua Jakarta, Dunia Fantasi, Taman Mini Indonesia Indah (TMII), Atlantis Water Adventure, Taman Impian Jaya Ancol, Wisata Kampung Krisan Clapar, Pulau Tidung, Kebun Binatang Ragunan.'
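The key step in the trace above is the join on Place_Id (the agent ran df1.merge(df2, on='Place_Id')), which attaches a place name to every rating. As a rough, standard-library stand-in for that pandas merge, the sketch below joins two small tables and ranks places by average rating. The column names and place names come from the output above, but the rating rows themselves are invented for illustration, and average_rating_by_name is a hypothetical helper.

```python
from collections import defaultdict

# Illustrative rows only: column names follow tourism_rating.csv and
# tourism_with_id.csv as shown above, but these ratings are invented.
ratings = [
    {"User_Id": 1, "Place_Id": 6, "Place_Ratings": 5},
    {"User_Id": 2, "Place_Id": 6, "Place_Ratings": 4},
    {"User_Id": 1, "Place_Id": 7, "Place_Ratings": 3},
    {"User_Id": 3, "Place_Id": 10, "Place_Ratings": 5},
]
places = [
    {"Place_Id": 6, "Place_Name": "Taman Impian Jaya Ancol"},
    {"Place_Id": 7, "Place_Name": "Kebun Binatang Ragunan"},
    {"Place_Id": 10, "Place_Name": "Pulau Tidung"},
]

def average_rating_by_name(ratings, places):
    """Join ratings to place names on Place_Id, then average per place."""
    name_by_id = {p["Place_Id"]: p["Place_Name"] for p in places}
    totals = defaultdict(lambda: [0, 0])  # name -> [sum of ratings, count]
    for rating in ratings:
        name = name_by_id.get(rating["Place_Id"])
        if name is None:
            continue  # no matching place: dropped, like an inner join
        totals[name][0] += rating["Place_Ratings"]
        totals[name][1] += 1
    averages = {name: total / count for name, (total, count) in totals.items()}
    # Highest average rating first.
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

ranking = average_rating_by_name(ratings, places)
for name, avg in ranking:
    print(f"{name}: {avg:.1f}")
```

Without the join, the best we can report is a list of place IDs, which is exactly what the single-file agent returned earlier.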
From here, we can ask plenty of questions and gain knowledge from the data given. This is just the beginning of the vast world of LLMs, with much more waiting for us to discover.