# LangChain: Introduction and Getting Started

> At its core, LangChain is a framework built around LLMs. We can use it for chatbots, Generative Question-Answering (GQA), summarization, and much more. The core idea of the library is that we can “chain” together different components to create more advanced use cases around LLMs.

**L**arge **L**anguage **M**odels (LLMs) entered the world stage with the release of OpenAI’s GPT-3 in 2020 [1]. Since then, they’ve enjoyed a steady growth in popularity.

That changed in late 2022, when interest in LLMs and the broader discipline of generative AI skyrocketed, most likely driven by the steady stream of significant advances in LLMs.

We saw the dramatic news about Google’s _“sentient”_ LaMDA chatbot. BLOOM, the first high-performance _open-source_ LLM, was released. OpenAI released their next-generation text embedding model and the next generation of _“GPT-3.5”_ models.

After all these giant leaps forward in the LLM space, OpenAI released _ChatGPT_ — thrusting LLMs into the spotlight.

[LangChain](https://github.com/hwchase17/langchain) appeared around the same time. Its creator, Harrison Chase, made the first commit in late October 2022, leaving only a short couple of months of development before the library was caught up in the LLM wave.

Despite being early days for the library, it is already packed full of incredible features for building amazing tools around the core of LLMs. In this article, we’ll introduce the library and start with the most straightforward component offered by LangChain — LLMs.

[Video](https://www.youtube.com/watch?v=nE2skSRWTTs)


---

## LangChain

At its core, LangChain is a framework built around LLMs. We can use it for chatbots, [Generative](https://www.pinecone.io/learn/openai-gen-qa/) [Question-Answering (GQA)](https://www.pinecone.io/learn/openai-gen-qa/), summarization, and much more.

The core idea of the library is that we can _“chain”_ together different components to create more advanced use cases around LLMs. Chains may consist of multiple components from several modules:

- **Prompt templates**: Prompt templates are templates for different types of prompts, like “chatbot”-style templates, ELI5 question-answering, and so on
- **LLMs**: Large language models like GPT-3, BLOOM, etc
- **Agents**: Agents use LLMs to decide what actions should be taken. Tools like web search or calculators can be used, and all are packaged into a logical loop of operations.
- **Memory**: Short-term memory, long-term memory.

We will dive into each of these in much more detail in upcoming chapters of the LangChain handbook.
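Before touching the library itself, the chaining idea can be sketched in plain Python. Below, `make_chain` and `fake_llm` are hypothetical stand-ins, not LangChain APIs; they only illustrate the core pattern of composing a prompt-template step with a model-call step:

```python
def make_chain(template, llm):
    """Compose a prompt template with a model call: fill the template, then call the LLM."""
    def run(**variables):
        prompt = template.format(**variables)
        return llm(prompt)
    return run

# hypothetical stand-in for a real LLM call
def fake_llm(prompt):
    return f"(model response to: {prompt!r})"

chain = make_chain("Question: {question}\nAnswer: ", fake_llm)
print(chain(question="What is LangChain?"))
```

LangChain’s chains follow this same shape, but with real prompt templates, real LLMs, and extra components like tools and memory slotted into the pipeline.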

For now, we’ll start with the basics behind **prompt templates** and **LLMs**. We’ll also explore two LLM options available from the library, using models from _Hugging Face Hub_ or _OpenAI_.




## Our First Prompt Templates

The prompts we input to LLMs are often structured in different ways so that we can get different results. For Q&A, we could take a user’s question and reformat it for different Q&A styles, like conventional Q&A, a bullet list of answers, or even a summary of problems relevant to the given question.
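As a quick plain-Python sketch (the template strings here are made up for illustration), reformatting the same question for different Q&A styles is just a matter of slotting it into different templates:

```python
question = "Which NFL team won the Super Bowl in the 2010 season?"

# two made-up prompt styles for the same question
concise = "Answer the question concisely.\n\nQuestion: {question}\nAnswer: "
bullets = "Answer the question as a bullet list of points.\n\nQuestion: {question}\nAnswer:\n- "

print(concise.format(question=question))
print(bullets.format(question=question))
```

LangChain’s prompt templates wrap exactly this kind of substitution in a reusable object.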

### Creating Prompts in LangChain

Let’s put together a simple question-answering prompt template. We first need to install the `langchain` library.

`!pip install langchain`

---

_Follow along with the code via_ _[the walkthrough](https://github.com/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/00-langchain-intro.ipynb)!_

---

From here, we import the `PromptTemplate` class and initialize a template like so:

```python
from langchain import PromptTemplate

template = """Question: {question}

Answer: """
prompt = PromptTemplate(
    template=template,
    input_variables=['question']
)

# user question
question = "Which NFL team won the Super Bowl in the 2010 season?"
```

Using this prompt template with the given `question`, we will get:

```
Question: Which NFL team won the Super Bowl in the 2010 season?

Answer: 
```

For now, that’s all we need. We’ll use the same prompt template across both Hugging Face Hub and OpenAI LLM generations.

## Hugging Face Hub LLM

The Hugging Face Hub endpoint in LangChain connects to the Hugging Face Hub and runs the models via their free inference endpoints. We need a [Hugging Face account and API key](https://huggingface.co/settings/tokens) to use these endpoints.

Once you have an API key, add it to the `HUGGINGFACEHUB_API_TOKEN` environment variable. We can do this with Python like so:

```python
import os

os.environ['HUGGINGFACEHUB_API_TOKEN'] = 'HF_API_KEY'
```

Next, we must install the `huggingface_hub` library via Pip.

`!pip install huggingface_hub`

Now we can generate text using a Hub model. We’ll use [google/flan-t5-xl](https://huggingface.co/google/flan-t5-xl).

---

_The default Hugging Face Hub inference APIs do not use specialized hardware and, therefore, can be slow. They are also not suitable for running larger models like_ _`bigscience/bloom`_ _or_ _`google/flan-t5-xxl`_ _(note_ _`xxl`_ _vs._ _`xl`)._

---

```python
from langchain import HuggingFaceHub, LLMChain

# initialize Hub LLM
hub_llm = HuggingFaceHub(
    repo_id='google/flan-t5-xl',
    model_kwargs={'temperature': 1e-10}
)

# create prompt template > LLM chain
llm_chain = LLMChain(
    prompt=prompt,
    llm=hub_llm
)

# ask the user question about NFL 2010
print(llm_chain.run(question))
```

```
green bay packers
```

For this question, we get the correct answer of `"green bay packers"`.

### Asking Multiple Questions

If we’d like to ask multiple questions, we can try two approaches:

1. Iterate through all questions using the `generate` method, answering them one at a time.
2. Place all questions into a single prompt for the LLM; this will only work for more advanced LLMs.

Starting with option (1), let’s see how to use the `generate` method:

```python
qs = [
    {'question': "Which NFL team won the Super Bowl in the 2010 season?"},
    {'question': "If I am 6 ft 4 inches, how tall am I in centimeters?"},
    {'question': "Who was the 12th person on the moon?"},
    {'question': "How many eyes does a blade of grass have?"}
]
res = llm_chain.generate(qs)
res
```

Which returns:

```
LLMResult(generations=[[Generation(text='green bay packers', generation_info=None)], [Generation(text='184', generation_info=None)], [Generation(text='john glenn', generation_info=None)], [Generation(text='one', generation_info=None)]], llm_output=None)
```

Here we get bad results except for the first question. This is simply a limitation of the LLM being used.
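If we want just the answer strings from a result like this, the nested `generations` list can be flattened with a list comprehension. A minimal sketch, mimicking the shape of the `LLMResult` returned by `generate` with a stand-in `Generation` class:

```python
from dataclasses import dataclass

@dataclass
class Generation:
    text: str

# mimic res.generations: one inner list of generations per input question
generations = [
    [Generation('green bay packers')],
    [Generation('184')],
    [Generation('john glenn')],
    [Generation('one')],
]

# take the first generation's text for each question
answers = [gens[0].text for gens in generations]
print(answers)
```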

If the model cannot answer individual questions accurately, grouping all queries into a single prompt is unlikely to work. However, for the sake of experimentation, let’s try it.

```python
multi_template = """Answer the following questions one at a time.

Questions:
{questions}

Answers:
"""
long_prompt = PromptTemplate(template=multi_template, input_variables=["questions"])

llm_chain = LLMChain(
    prompt=long_prompt,
    llm=hub_llm
)

qs_str = (
    "Which NFL team won the Super Bowl in the 2010 season?\n" +
    "If I am 6 ft 4 inches, how tall am I in centimeters?\n" +
    "Who was the 12th person on the moon?\n" +
    "How many eyes does a blade of grass have?"
)

print(llm_chain.run(qs_str))
```

```
If I am 6 ft 4 inches, how tall am I in centimeters
```

As expected, the results are not helpful. We’ll see later that more powerful LLMs can do this.

## OpenAI LLMs

The OpenAI endpoints in LangChain connect to OpenAI directly or via Azure. We need an [OpenAI account and API key](https://beta.openai.com/account/api-keys) to use these endpoints.

Once you have an API key, add it to the `OPENAI_API_KEY` environment variable. We can do this with Python like so:

```python
import os

os.environ['OPENAI_API_KEY'] = 'OPENAI_API_KEY'
```

Next, we must install the `openai` library via Pip.

`!pip install openai`

Now we can generate text using OpenAI’s GPT-3 generation (or _completion_) models. We’ll use `text-davinci-003`.

```python
from langchain.llms import OpenAI

davinci = OpenAI(model_name='text-davinci-003')
```

_Alternatively, if you’re using OpenAI via Azure, you can do:_

```python
from langchain.llms import AzureOpenAI

llm = AzureOpenAI(
    deployment_name="your-azure-deployment", 
    model_name="text-davinci-003"
)
```

We’ll use the same simple question-answer prompt template as before with the Hugging Face example. The only change is that we now pass our OpenAI LLM `davinci`:

```python
llm_chain = LLMChain(
    prompt=prompt,
    llm=davinci
)

print(llm_chain.run(question))
```

```
 The Green Bay Packers won the Super Bowl in the 2010 season.
```

As expected, we’re getting the correct answer. We can do the same for multiple questions using `generate`:

```python
qs = [
    {'question': "Which NFL team won the Super Bowl in the 2010 season?"},
    {'question': "If I am 6 ft 4 inches, how tall am I in centimeters?"},
    {'question': "Who was the 12th person on the moon?"},
    {'question': "How many eyes does a blade of grass have?"}
]
llm_chain.generate(qs)
```

Which returns:

```
LLMResult(generations=[[Generation(text=' The Green Bay Packers won the Super Bowl in the 2010 season.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text=' 193.04 centimeters', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text=' Charles Duke was the 12th person on the moon. He was part of the Apollo 16 mission in 1972.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text=' A blade of grass does not have any eyes.', generation_info={'finish_reason': 'stop', 'logprobs': None})]], llm_output={'token_usage': {'total_tokens': 124, 'prompt_tokens': 75, 'completion_tokens': 49}})
```

Most of our results are correct or have a degree of truth. The model undoubtedly functions better than the `google/flan-t5-xl` model. As before, let’s try feeding all questions into the model at once.

```python
llm_chain = LLMChain(
    prompt=long_prompt,
    llm=davinci
)

qs_str = (
    "Which NFL team won the Super Bowl in the 2010 season?\n" +
    "If I am 6 ft 4 inches, how tall am I in centimeters?\n" +
    "Who was the 12th person on the moon?\n" +
    "How many eyes does a blade of grass have?"
)

print(llm_chain.run(qs_str))
```

```
The New Orleans Saints won the Super Bowl in the 2010 season.
6 ft 4 inches is 193 centimeters.
The 12th person on the moon was Harrison Schmitt.
A blade of grass does not have eyes.
```

If we rerun the query, the model will occasionally make errors, but at other times it manages to get all the answers correct.

---

That’s it for our introduction to LangChain — a library that allows us to build more advanced apps around LLMs like OpenAI’s GPT-3 models or the open-source alternatives available via Hugging Face.

As mentioned, LangChain can do much more than we’ve demonstrated here. We’ll be covering these other features in upcoming articles.

---

## References

[1] [GPT-3 Archived Repo](https://github.com/openai/gpt-3) (2020), OpenAI GitHub