vLLM Chat
vLLM can be deployed as a server that mimics the OpenAI API protocol. This allows vLLM to be used as a drop-in replacement for applications using OpenAI API. This server can be queried in the same format as OpenAI API.
This notebook covers how to get started with vLLM chat models using
langchain’s ChatOpenAI
as it is.
from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
ChatPromptTemplate,
HumanMessagePromptTemplate,
SystemMessagePromptTemplate,
)
from langchain.schema import HumanMessage, SystemMessage
inference_server_url = "http://localhost:8000/v1"
chat = ChatOpenAI(
model="mosaicml/mpt-7b",
openai_api_key="EMPTY",
openai_api_base=inference_server_url,
max_tokens=5,
temperature=0,
)
messages = [
SystemMessage(
content="You are a helpful assistant that translates English to Italian."
),
HumanMessage(
content="Translate the following sentence from English to Italian: I love programming."
),
]
chat(messages)
AIMessage(content=' Io amo programmare', additional_kwargs={}, example=False)
You can make use of templating by using a MessagePromptTemplate
. You
can build a ChatPromptTemplate
from one or more
MessagePromptTemplates
. You can use ChatPromptTemplate’s format_prompt
– this returns a PromptValue
, which you can convert to a string or
Message
object, depending on whether you want to use the formatted
value as input to an llm or chat model.
For convenience, there is a from_template
method exposed on the
template. If you were to use this template, this is what it would look
like:
template = (
"You are a helpful assistant that translates {input_language} to {output_language}."
)
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
human_template = "{text}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages(
[system_message_prompt, human_message_prompt]
)
# get a chat completion from the formatted messages
chat(
chat_prompt.format_prompt(
input_language="English", output_language="Italian", text="I love programming."
).to_messages()
)
AIMessage(content=' I love programming too.', additional_kwargs={}, example=False)