Get started
LCEL makes it easy to build complex chains from basic components, and supports out of the box functionality such as streaming, parallelism, and logging.
Basic example: prompt + model + output parser
The most basic and common use case is chaining a prompt template and a model together. To see how this works, let’s create a chain that takes a topic and generates a joke:
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
prompt = ChatPromptTemplate.from_template("tell me a short joke about {topic}")
model = ChatOpenAI()
output_parser = StrOutputParser()
chain = prompt | model | output_parser
chain.invoke({"topic": "ice cream"})
"Why did the ice cream go to therapy?\n\nBecause it had too many toppings and couldn't find its cone-fidence!"
Notice this line of the code, where we piece together these different components into a single chain using LCEL:
chain = prompt | model | output_parser
The | symbol is similar to a Unix pipe operator: it chains together the different components, feeding the output from one component as input into the next component.
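To get a feel for how the | operator composes components, here is a minimal sketch (not part of the example above) that pipes together two plain Python functions wrapped in RunnableLambda; the add_one and double names are purely illustrative:
from langchain_core.runnables import RunnableLambda

add_one = RunnableLambda(lambda x: x + 1)  # any Runnable can sit in a chain
double = RunnableLambda(lambda x: x * 2)

mini_chain = add_one | double  # | composes the two into a RunnableSequence
mini_chain.invoke(3)
# > 8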
In this chain the user input is passed to the prompt template, then the prompt template output is passed to the model, then the model output is passed to the output parser. Let’s take a look at each component individually to really understand what’s going on.
1. Prompt
prompt is a BasePromptTemplate, which means it takes in a dictionary of template variables and produces a PromptValue. A PromptValue is a wrapper around a completed prompt that can be passed to either an LLM (which takes a string as input) or a ChatModel (which takes a sequence of messages as input). It can work with either language model type because it defines logic both for producing BaseMessages and for producing a string.
prompt_value = prompt.invoke({"topic": "ice cream"})
prompt_value
ChatPromptValue(messages=[HumanMessage(content='tell me a short joke about ice cream')])
prompt_value.to_messages()
[HumanMessage(content='tell me a short joke about ice cream')]
prompt_value.to_string()
'Human: tell me a short joke about ice cream'
2. Model
The PromptValue is then passed to model. In this case our model is a ChatModel, meaning it will output a BaseMessage.
message = model.invoke(prompt_value)
message
AIMessage(content="Why did the ice cream go to therapy? \n\nBecause it had too many toppings and couldn't find its cone-fidence!")
If our model were an LLM, it would output a string.
from langchain.llms import OpenAI
llm = OpenAI(model="gpt-3.5-turbo-instruct")
llm.invoke(prompt_value)
'\n\nRobot: Why did the ice cream go to therapy? Because it had a rocky road.'
3. Output parser
And lastly we pass our model output to the output_parser, which is a BaseOutputParser, meaning it takes either a string or a BaseMessage as input. The StrOutputParser specifically simply converts any input into a string.
output_parser.invoke(message)
"Why did the ice cream go to therapy? \n\nBecause it had too many toppings and couldn't find its cone-fidence!"
4. Entire Pipeline
To follow the steps along:
- We pass in user input on the desired topic as {"topic": "ice cream"}.
- The prompt component takes the user input and uses the topic to construct the prompt, producing a PromptValue.
- The model component takes the generated prompt and passes it into the OpenAI chat model for evaluation. The generated output from the model is a BaseMessage (here, an AIMessage).
- Finally, the output_parser component takes in that message and transforms it into a Python string, which is returned from the invoke method.
Note that if you're curious about the output of any of the components, you can always test out a smaller version of the chain, such as prompt or prompt | model, to see the intermediate results:
input = {"topic": "ice cream"}
prompt.invoke(input)
# > ChatPromptValue(messages=[HumanMessage(content='tell me a short joke about ice cream')])
(prompt | model).invoke(input)
# > AIMessage(content="Why did the ice cream go to therapy?\nBecause it had too many toppings and couldn't cone-trol itself!")
RAG Search Example
For our next example, we want to run a retrieval-augmented generation chain to add some context when responding to questions.
# Requires:
# pip install langchain docarray tiktoken
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import ChatPromptTemplate
from langchain.vectorstores import DocArrayInMemorySearch
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
vectorstore = DocArrayInMemorySearch.from_texts(
    ["harrison worked at kensho", "bears like to eat honey"],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()
template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI()
output_parser = StrOutputParser()
setup_and_retrieval = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)
chain = setup_and_retrieval | prompt | model | output_parser
chain.invoke("where did harrison work?")
In this case, the composed chain is:
chain = setup_and_retrieval | prompt | model | output_parser
To explain this, we can first see that the prompt template above takes in context and question as values to be substituted into the prompt. Before building the prompt template, we want to retrieve documents relevant to the question and include them as part of the context.
As a preliminary step, we've set up the retriever using an in-memory store, which can retrieve documents based on a query. This is a runnable component as well that can be chained together with other components, but you can also run it separately:
retriever.invoke("where did harrison work?")
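This returns a list of Document objects ranked by relevance to the query; that list is what later gets passed to the prompt as context.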
We then use RunnableParallel to prepare the expected inputs into the prompt, using the retriever to fetch the relevant documents for the context entry and RunnablePassthrough to pass the user's question through unchanged:
setup_and_retrieval = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)
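If you want to see exactly what the prompt will receive, you can invoke this step on its own (the output below is indicative; the exact Document representation depends on your langchain version and embeddings):
setup_and_retrieval.invoke("where did harrison work?")
# > {'context': [Document(page_content='harrison worked at kensho'), ...],
# >  'question': 'where did harrison work?'}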
To review, the complete chain is:
setup_and_retrieval = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)
chain = setup_and_retrieval | prompt | model | output_parser
With the flow being:
- The first step creates a RunnableParallel object with two entries. The first entry, context, will include the document results fetched by the retriever. The second entry, question, will contain the user's original question. To pass on the question, we use RunnablePassthrough to copy this entry.
- The dictionary from the step above is fed to the prompt component. It then takes the user input, which is question, as well as the retrieved documents, which are context, to construct a prompt and output a PromptValue.
- The model component takes the generated prompt and passes it into the OpenAI chat model for evaluation. The generated output from the model is a BaseMessage (here, an AIMessage).
- Finally, the output_parser component takes in that message and transforms it into a Python string, which is returned from the invoke method.
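As an aside, when a dictionary is composed with another Runnable using |, it is coerced into a RunnableParallel automatically, so the chain above could also be written without the explicit wrapper (this sketch assumes the same retriever, prompt, model, and output_parser as before):
chain = {"context": retriever, "question": RunnablePassthrough()} | prompt | model | output_parser
chain.invoke("where did harrison work?")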
Next steps
We recommend reading our Why use LCEL section next to see a side-by-side comparison of the code needed to produce common functionality with and without LCEL.