*Building Large Language Models for Business Applications*

Prelude

Large Language Models (LLMs) are an advanced type of language model that represents a breakthrough in the field of natural language processing (NLP). These models are designed to understand and generate human-like text by leveraging the power of deep learning algorithms and massive amounts of data.

If you've ever chatted with a virtual assistant or interacted with an AI customer service agent, you might have interacted with a large language model without even realizing it. These models have a wide range of applications, from chatbots to language translation to content creation.

Some of the most impressive large language models have been developed by OpenAI. Their GPT-3 model, for example, has 175 billion parameters and is able to perform tasks like summarization, question answering, and even creative writing.

How Is a Large Language Model Built?

The architecture of LLMs is based on the Transformer model, which has revolutionized NLP tasks. The Transformer model utilizes a self-attention mechanism that allows the model to focus on different parts of the input sequence, capturing dependencies and relationships between words more effectively. This architecture enables LLMs to generate coherent and contextually relevant responses, making them valuable tools for a wide range of applications.

A large-scale transformer model known as a “large language model” is typically too massive to run on a single computer and is, therefore, provided as a service over an API or web interface. These models are trained on vast amounts of text data from sources such as books, articles, websites, and numerous other forms of written content. By analyzing the statistical relationships between words, phrases, and sentences through this training process, the models can generate coherent and contextually relevant responses to prompts or queries.

GPT-3, the model family that ChatGPT builds on, was trained on massive amounts of internet text data, giving it the ability to understand various languages and possess knowledge of diverse topics. As a result, it can produce text in multiple styles. While its capabilities, including translation, text summarization, and question answering, may seem impressive, they all emerge from the same underlying mechanism: the model has learned statistical patterns of language that let it match a prompt with a plausible, contextually appropriate continuation.

Understanding the Transformer

The Transformer is a type of deep learning architecture that has revolutionized the field of natural language processing. It was introduced in the paper "Attention Is All You Need" by Vaswani et al. (2017). The Transformer model employs self-attention mechanisms to capture dependencies between words in a sentence, enabling it to learn contextual relationships and generate coherent and contextually relevant text.

The Transformer architecture excels at handling text data, which is inherently sequential. It takes a text sequence as input and produces another text sequence as output, e.g., translating an input French sentence to English.

The Transformer architecture consists of two main components: the encoder and the decoder. The encoder processes the input sequence and generates a representation, which is then passed to the decoder. The decoder generates the output sequence based on the encoder's representation and previous outputs.

Here are the key components and concepts of the Transformer architecture:

  1. Positional Encoding: Transformers incorporate positional encoding to provide the model with information about the order of words in the input sequence. Positional encoding is usually added to the input embeddings and allows the model to differentiate between the positions of words.

[Figure: positional encoding]

  2. Self-Attention Mechanism: Self-attention allows each word in the input sequence to attend to all other words. It computes an attention weight between each pair of words and uses these weights to generate a weighted sum of the word embeddings. This mechanism enables the model to capture long-range dependencies and learn contextual relationships effectively; the standard formulations of both mechanisms are given below.

[Figure: self-attention mechanism]
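For reference, the standard formulations of both mechanisms, as defined in Vaswani et al. (2017), are:

PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))

Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V

Here pos is a word's position, d_model is the embedding dimension, Q, K, and V are the query, key, and value matrices derived from the word embeddings, and d_k is the key dimension used for scaling.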

Pre-training and Fine-tuning of Language Models

Pre-training and fine-tuning are two key steps in the training process of language models, including Large Language Models (LLMs).

Pre-training: In the pre-training phase, a language model is trained on a large corpus of unlabeled text data. During this phase, the model learns to predict missing words in sentences based on the surrounding context. It develops an understanding of language patterns, grammar, and contextual relationships. The pre-training process typically involves techniques like masked language modeling, where certain words are randomly masked and the model learns to predict them based on the remaining context.
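As a small illustration of masked language modeling (this sketch uses the Hugging Face transformers library rather than LangChain, purely to show the idea):

from transformers import pipeline

# BERT was pre-trained with a masked language modeling objective
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# The model ranks plausible fillers for [MASK], e.g. "mat" or "floor"
fill_mask("The cat sat on the [MASK].")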

Fine-tuning: After pre-training, the language model is fine-tuned on specific labeled datasets for specific downstream tasks. Fine-tuning involves training the pre-trained model on labeled data related to a particular task, such as question answering, sentiment analysis, or text classification. This process allows the model to adapt to the specific task by learning task-specific patterns and improving its performance. Fine-tuning is performed on a smaller dataset, which is typically task-specific and labeled by human experts.

In the context of Large Language Models (LLMs), the terms "pre-trained" and "fine-tuned" refer to these two stages in the model development process. This two-step process allows a single, general-purpose pre-trained model to be adapted to many downstream tasks with comparatively little labeled data.

LLMs, such as GPT-3, GPT-2, and BERT, are examples of large-scale language models that have undergone extensive pre-training and fine-tuning processes. They have been trained on vast amounts of text data and have a large number of parameters. This pre-training and fine-tuning approach allows LLMs to capture complex language patterns, generate coherent text, and perform well on a wide range of natural language processing tasks.

Popular Large Language Models

Popular LLMs are advanced models that have gained significant attention in the field of natural language processing. They have been trained on massive amounts of text data and have a large number of parameters, allowing them to capture complex language patterns and generate coherent text.

Here are some examples of popular LLMs:

  1. GPT-3 (Generative Pre-trained Transformer 3): GPT-3 is a state-of-the-art language model developed by OpenAI. It is renowned for its impressive size, consisting of 175 billion parameters. GPT-3 has been trained on a vast amount of internet text data, enabling it to understand and generate human-like text. It can perform a wide range of natural language processing tasks, including language translation, text completion, sentiment analysis, and more. GPT-3 has shown remarkable capabilities in generating coherent and contextually relevant responses, making it a powerful tool for various applications.

  2. GPT-2 (Generative Pre-trained Transformer 2): GPT-2 is the predecessor to GPT-3, also developed by OpenAI. Although smaller in size with 1.5 billion parameters, GPT-2 still delivers impressive language generation capabilities. It has been trained on diverse internet text sources, allowing it to produce high-quality text in a variety of styles and topics. GPT-2 is widely used for tasks such as text completion, text generation, and language understanding.

  3. BERT (Bidirectional Encoder Representations from Transformers): BERT is a groundbreaking language model developed by Google. It introduced the concept of bidirectional training, which significantly improved the understanding of context in natural language processing. BERT has been trained on large-scale text data and employs a transformer architecture. It excels in various language understanding tasks, including question-answering, sentiment analysis, named entity recognition, and more. BERT has set new benchmarks in several natural language processing tasks and has been widely adopted in both research and industry.

Capabilities and limitations of Large Language Models

Large language models like GPT-3, GPT-2, and BERT exhibit impressive capabilities: they can understand complex language structures, generate coherent text, and perform well on a range of natural language processing tasks such as translation, summarization, question answering, and sentiment analysis.

However, it's essential to acknowledge the limitations of LLMs: they can produce fluent but factually incorrect output, and their knowledge is restricted to the general information present in their training data. The latter limitation, in particular, motivates the retrieval-based techniques discussed later in this article.

Introduction to LangChain

LangChain is a framework for developing applications powered by language models. It integrates multiple language models and APIs to create a powerful and flexible language processing pipeline, connecting models such as OpenAI's GPT-3 or GPT-2 with other tools and APIs to enhance their functionality and address specific business needs.

The LangChain concept aims to leverage the strengths of each language model and API to create a comprehensive language processing system. It allows developers to combine different models for tasks like question answering, text generation, translation, summarization, sentiment analysis, and more.

The core idea of the library is that we can "chain" together different components to create more advanced use cases around LLMs. Chains may consist of multiple components from several modules:

  1. Prompt templates: Prompt templates are templates for different types of prompts, such as "chatbot"-style templates, ELI5 question answering, etc.

  2. LLMs: Large language models like GPT-3, BLOOM, etc.

  3. Agents: Agents use LLMs to decide what actions should be taken. Tools like web search or calculators can be used, and all are packaged into a logical loop of operations.

  4. Memory: Short-term memory, long-term memory.

Environment Set-up

Using LangChain will usually require integrations with one or more model providers, data stores, APIs, etc. For this example, we'll use OpenAI's model APIs.

Setting API key and .env

Accessing the API requires an API key, which you can get by creating an OpenAI account and generating a key on the API keys page of your account settings. When setting up an API key and using a .env file in your Python project, you follow these general steps:

  1. Obtain an API key: If you're working with an external API or service that requires an API key, you need to obtain one from the provider. This usually involves signing up for an account and generating an API key specific to your project.

  2. Create a .env file: In your project directory, create a new file and name it ".env". This file will store your API key and other sensitive information securely.

  3. Store API key in .env: Open the .env file in a text editor and add a line to store your API key. The format should be API_KEY=your_api_key, where "API_KEY" is the name of the variable and "your_api_key" is the actual value of your API key. Make sure not to include any quotes or spaces around the value.

  4. Load environment variables: In your Python code, you need to load the environment variables from the .env file before accessing them. Import the dotenv module and add the following code at the beginning of your script:

from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

The dotenv library is a popular Python library that simplifies loading environment variables from a .env file into your Python application. It allows you to store configuration variables separately from your code, making it easier to manage sensitive information such as API keys, database credentials, or other environment-specific settings.
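For example, assuming the key is stored under the name OPENAI_API_KEY (the variable name the OpenAI client and LangChain look for by default):

import os
from dotenv import load_dotenv

# .env contains a line like: OPENAI_API_KEY=your_api_key
load_dotenv()

# Read the key back from the environment; returns None if it is missing
api_key = os.getenv("OPENAI_API_KEY")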

LangChain Quickstart

In LangChain, a QuickStart involves working with three key components: Prompt, Chain, and Agent.

With the Prompt, Chain, and Agent components working together, we can engage in interactive conversations with the language model. The Prompt sets the context or initiates the conversation, the Chain maintains the conversation history, and the Agent manages the communication between the user and the language model.

Using these components, we can build dynamic and interactive applications that involve back-and-forth interactions with the language model, allowing us to create conversational agents, chatbots, question-answering systems, and more.

To use the LangChain library with an OpenAI language model, we take the following steps:

  1. Importing the Required Module: The code imports the LangChain library by using the statement from langchain import OpenAI.

  2. Creating an OpenAI Instance: The code creates an instance of the OpenAI class and assigns it to the variable llm. This instance represents the connection to the OpenAI language model.

  3. Setting the Temperature Parameter: The temperature parameter is passed to the OpenAI instance during its initialization. Temperature is a parameter that controls the randomness of the language model's output.

    A higher temperature value (e.g., 0.9) makes the generated text more diverse and creative, while a lower value (e.g., 0.2) makes it more focused and deterministic (see the sketch after this list).
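A minimal sketch of these three steps, assuming the API key is already available in the environment as set up earlier:

# Connect to an OpenAI language model through LangChain
from langchain import OpenAI

# temperature=0.9 favors more diverse, creative completions
llm = OpenAI(temperature=0.9)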

By creating an instance of OpenAI and setting the desired temperature, we can now use the llm object to interact with the OpenAI language model. We can pass prompts or messages to the llm object, receive the generated responses, and customize the behavior of the language model using additional parameters and methods provided by the LangChain library.

Prompt

Basic Prompt

A prompt refers to the initial input or instruction given to the language model to generate a response. It sets the context and provides guidance for the language model to produce relevant and coherent text.

In this example, the prompt asks for a suggestion of a good name for a brand specializing in local burgers.

The language model then uses its knowledge and training to generate a response that fits the given prompt. Notice that every re-run generates a new answer.

The llm.predict() function is called with the prompt as the input. This function sends the prompt to the language model and generates a response based on the given input. The generated text represents the language model's prediction or completion of the prompt.
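A sketch of this call (the exact prompt wording is illustrative):

# Send the prompt to the model; each run may return a different suggestion
response = llm.predict("What is a good name for a brand that makes local burgers?")
print(response)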

We can also do this in other languages; let's try Bahasa Indonesia.

By simply providing a prompt in Bahasa Indonesia, we can obtain a generated text response in Indonesian as well. This showcases the versatility of the language models behind LangChain in understanding and generating text in various languages, allowing for multilingual applications and interactions.
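For example, with a hypothetical prompt meaning "What is a good name for a brand that sells local burgers?":

# The model tends to reply in the same language as the prompt
print(llm.predict("Apa nama yang bagus untuk merek yang menjual burger lokal?"))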

Prompt Templates

LLM applications typically utilize a prompt template instead of directly inputting user queries into the LLM. This approach involves incorporating the user input into a larger text context known as a prompt template.

A prompt template is a structured format designed to generate prompts in a consistent manner. It consists of a text string, referred to as the "template," which can incorporate various parameters provided by the end user to create a dynamic prompt.

The prompt template can include instructions to the language model, a small set of examples to help guide the response, and a question or other input supplied by the user.

In the previous example, the text passed to the model contained instructions to generate a brand name based on a given description. In our application, it would be convenient for users to only provide the description of their company or product without the need to explicitly provide instructions to the model.

To create a prompt template using LangChain, we begin by importing the PromptTemplate class from the langchain.prompts module. This class allows us to create and manipulate prompt templates.

Create a prompt template: Use the PromptTemplate.from_template() method to create a PromptTemplate object from the template string.

In this case, the template string is "What is a good name for a brand that makes {product}?", where {product} acts as a placeholder for the product name.

Format the prompt template: Use the .format() method of the PromptTemplate object to replace the placeholder in the template with the desired value. In this case, the placeholder {product} is replaced with the string "local burger".
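Putting these two steps together:

from langchain.prompts import PromptTemplate

# Create the template; {product} is a placeholder filled in later
prompt = PromptTemplate.from_template(
    "What is a good name for a brand that makes {product}?"
)

# Replace the placeholder with a concrete value
prompt.format(product="local burger")
# -> "What is a good name for a brand that makes local burger?"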

Notice that the instruction changes automatically based on user input; this instruction is then passed to the LLM to generate the response. Let's get the response generated by the language model (llm) based on the given prompt.

Because this is a template, it can handle more than one input. For example:

The .format() call replaces the placeholders {adjective} and {subject} with the provided values. The resulting string will be "Write a sad poem about ducks".
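Assuming a template string along these lines, the call looks as follows:

multi_prompt = PromptTemplate.from_template(
    "Write a {adjective} poem about {subject}"
)
multi_prompt.format(adjective="sad", subject="ducks")
# -> "Write a sad poem about ducks"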

We can also create a prompt template that acts as a naming consultant for new companies.

By using prompt templates, we can easily generate prompts for various industries by filling in the specific values for the variables. This approach allows us to create dynamic and customizable prompts for the naming consultant application.
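A sketch of such a template, using the explicit input_variables constructor (the template wording is an assumption for illustration):

consultant_prompt = PromptTemplate(
    input_variables=["industry"],
    template=(
        "I want you to act as a naming consultant for new companies. "
        "What is a good name for a company in the {industry} industry?"
    ),
)

# Fill in a specific industry
consultant_prompt.format(industry="tourism")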

Chain

Now that we have our model and prompt template, we can combine them by creating a "chain". Chains provide a mechanism to link or connect multiple components, such as models, prompts, and other chains.

The most common type of chain is an LLMChain, which involves passing the input through a PromptTemplate and then to an LLM. We can create an LLMChain using our existing model and prompt template.

For example, if we want to generate a response using our template, our workflow would be as follows:

  1. Create the prompt based on the input with template_prompt
  2. Generate a response from the prompt with llm

We can simplify this workflow by chaining (linking) them together with Chains.

The chain.run() method generates a response from the LLM model based on the provided input.
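A minimal sketch, reusing the llm and prompt objects created earlier:

from langchain.chains import LLMChain

# Link the prompt template and the model into a single chain
chain = LLMChain(llm=llm, prompt=prompt)

# The input is formatted into the template, then sent to the LLM
chain.run("local burger")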

By chaining the LLM model and the prompt template using the LLMChain class, we can conveniently pass inputs through the template and obtain contextually relevant responses from the model. This simple chain allows us to generate responses with just one line of code for each new input. Understanding the workings of this basic chain will serve as a solid foundation for working with more intricate chains.

Agents

In more complex workflows, it becomes crucial to have the ability to make decisions and choose actions based on the given context. This is where agents come into play.

Agents utilize a language model to determine which actions to take and in what sequence. They have a set of tools at their disposal, and they continually select, execute, and evaluate these tools until they arrive at the optimal solution. Agents provide a dynamic and adaptable approach to problem-solving within the LangChain framework, allowing for more sophisticated and flexible workflows.

To load an agent in LangChain, you need to consider the following components: the tools the agent can use, the language model that drives its decisions, and the agent type that determines its behavior.

For the specific example mentioned, we will utilize the wikipedia tool to query and retrieve responses based on Wikipedia information. This tool allows the agent to access relevant information from Wikipedia and provide informative responses based on the given input.

Import the required modules: The code starts by importing the necessary modules from LangChain, such as AgentType, initialize_agent, and load_tools. These modules provide the functionalities required to create and configure the agent.

Define the language model for the agent: In this example, the llm_agent is initialized with the OpenAI class, which represents the language model. The temperature parameter determines the level of randomness in the generated responses.

Load the tools: The load_tools function is used to load the desired tools for the agent. In this case, the tools "wikipedia" and "llm-math" are loaded.

The "wikipedia" tool allows the agent to access information from Wikipedia, while the "llm-math" tool utilizes the language model for mathematical operations.

Initialize the agent: The initialize_agent function is called to create an agent instance. It takes the loaded tools, the language model (llm_agent), the agent type (AgentType.ZERO_SHOT_REACT_DESCRIPTION), and an optional verbose parameter. The agent type determines the behavior of the agent, such as generating responses based on descriptions or reacting to user inputs.
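Putting these steps together (the question passed to agent.run is illustrative, and the wikipedia tool additionally requires the wikipedia Python package to be installed):

from langchain import OpenAI
from langchain.agents import AgentType, initialize_agent, load_tools

# A low temperature keeps the agent's reasoning deterministic
llm_agent = OpenAI(temperature=0)

# llm-math needs an LLM of its own, so we pass one in
tools = load_tools(["wikipedia", "llm-math"], llm=llm_agent)

agent = initialize_agent(
    tools,
    llm_agent,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,  # print the agent's intermediate reasoning steps
)

agent.run("In what century was Borobudur temple built?")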

By executing these steps, we establish an agent that can utilize various tools and interact with the chosen language model to generate contextually relevant responses based on the given input.

Build Question Answering System

Introduction to Question-Answer System

As we know, LangChain is an open-source library that provides developers with powerful tools for building applications using Large Language Models (LLMs). In our previous example, we saw how we could use an LLM to generate responses based on a given question. However, there may be cases where we need to ask more specific questions related to our business domain. For instance, we might want to ask the LLM about our company's top revenue-generating product.

LLMs have certain limitations when it comes to specific contextual knowledge, as they are trained on a vast amount of general information. To overcome this limitation, we can provide additional documents or context to the LLM. The idea is to retrieve relevant documents related to our question from a corpus or database and then pass them along with the original question to the LLM. This allows the LLM to generate a response that is informed by the specific information contained in the retrieved documents.

These documents can come from various sources such as databases, PDF files, plain text files, or even information extracted from websites. By connecting and feeding these documents to the LLM, we can build a powerful Question-Answer System that leverages the LLM's language generation capabilities while incorporating domain-specific knowledge.

In this section, we will explore how to connect and feed a database and text information to an LLM to build a Question-Answer System that can provide contextually relevant answers to specific business-related questions.

Structured Data

Connecting to CSV

Structured data is not only stored in database files; it can also be stored in other formats such as .xlsx and .csv, which represent data in a tabular form with columns and rows. In addition to providing agents to generate answers from databases using SQL based on natural language prompts, LangChain also offers agents to generate answers based on tabular structured data sources, such as CSV files. In this section, we will demonstrate how to utilize the agent for CSV data.

To begin, let's define the file paths of our dataset about tourism in Indonesia. I have four CSV files whose contents we can ask the agent to explain.

Next, we will create an agent specifically designed for working with CSV data. This agent will allow us to query and retrieve information from the tourism dataset.
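At the time of writing, LangChain provides a create_csv_agent helper for this (in newer releases it has moved to langchain_experimental.agents); a sketch with a hypothetical file path:

from langchain.agents import create_csv_agent

csv_agent = create_csv_agent(
    OpenAI(temperature=0),
    "data/tourism_with_id.csv",  # hypothetical path to one of the tourism CSVs
    verbose=True,
)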

Then we simply ask a question about our data.

To simplify the output, I will set verbose to False.
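For example (the question is illustrative):

# Re-create the agent without the step-by-step trace
quiet_agent = create_csv_agent(
    OpenAI(temperature=0),
    "data/tourism_with_id.csv",  # hypothetical path
    verbose=False,
)
quiet_agent.run("How many rows and columns does this dataset have?")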

I am curious about the tourism rating of each place, so let's dive deeper into it.

Sometimes we have to combine information from several structured data sources to gain better insight.

From these actions, we can ask plenty of questions and gain knowledge from the given data. This is just the beginning of the vast world of LLMs, with much more awaiting us to discover.
