ChatOllama
Ollama allows you to run open-source large language models, such as LLaMA2, locally.
Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.
It optimizes setup and configuration details, including GPU usage.
For a complete list of supported models and model variants, see the Ollama model library.
Setup
First, follow these instructions to set up and run a local Ollama instance:
- Download and install Ollama
- Fetch a model via `ollama pull <model family>`
  - e.g., for `Llama-7b`: `ollama pull llama2`
  - This will download the most basic version of the model (e.g., fewest parameters and 4-bit quantization)
  - On Mac, it will download to `~/.ollama/models/manifests/registry.ollama.ai/library/<model family>/latest`
- You can also specify a particular version, e.g., `ollama pull vicuna:13b-v1.5-16k-q4_0`
  - The file is then stored with the model version in place of `latest`: `~/.ollama/models/manifests/registry.ollama.ai/library/vicuna/13b-v1.5-16k-q4_0`
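As a quick sanity check that a pulled model is visible locally, you can query the Ollama REST API. This is an optional step added here, assuming the app is running on its default port:

```python
import requests

# The running Ollama app serves a REST API on localhost:11434;
# /api/tags lists the models that have been pulled locally.
resp = requests.get("http://localhost:11434/api/tags")
for model in resp.json()["models"]:
    print(model["name"])
```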
You can easily access models in a few ways:

1/ if the app is running:
- All of your local models are automatically served on `localhost:11434`
- Select your model when setting `llm = Ollama(..., model="<model family>:<version>")`
- If you set `llm = Ollama(..., model="<model family>")` without a version, it will simply look for `latest`

2/ if building from source or just running the binary:
- Then you must run `ollama serve`
- All of your local models are automatically served on `localhost:11434`
- Then, select as shown above
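Putting this together, here is a minimal sketch of the completion-style interface, assuming the server is running and `llama2` has been pulled:

```python
from langchain.llms import Ollama

# Connects to the local Ollama server on localhost:11434 by default.
llm = Ollama(model="llama2")
print(llm("Why is the sky blue?"))
```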
Usage
You can see a full list of supported parameters on the API reference page.

If you are using a LLaMA chat model (e.g., `ollama pull llama2:7b-chat`), then you can use the `ChatOllama` interface. This includes special tokens for the system message and user input.
```python
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chat_models import ChatOllama

chat_model = ChatOllama(
    model="llama2:7b-chat",
)
```
Optionally, pass `StreamingStdOutCallbackHandler` to stream tokens:
```python
chat_model = ChatOllama(
    model="llama2:7b-chat",
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
)
```
```python
from langchain.schema import HumanMessage

messages = [HumanMessage(content="Tell me about the history of AI")]
chat_model(messages)
```
```
AIMessage(content='\nArtificial intelligence (AI) has a rich and diverse history that spans several decades. Here is a brief overview of the major milestones and events in the development of AI:\n\n1. 1950s: The Dartmouth Conference: The field of AI was officially launched at a conference held at Dartmouth College in 1956. Attendees included computer scientists, mathematicians, and cognitive scientists who were interested in exploring the possibilities of creating machines that could simulate human intelligence.\n2. 1951: The Turing Test: British mathematician Alan Turing proposed a test to measure a machine\'s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. The Turing Test has since become a benchmark for measuring the success of AI systems.\n3. 1956: The First AI Program: Computer scientist John McCarthy created the first AI program, called the Logical Theorist, which was designed to reason and solve problems using logical deduction.\n4. 1960s: Rule-Based Expert Systems: The development of rule-based expert systems, which used a set of rules to reason and make decisions, marked a significant milestone in the history of AI. These systems were widely used in industries such as banking, healthcare, and transportation.\n5. 1970s: Machine Learning: Machine learning, which enables machines to learn from data without being explicitly programmed, emerged as a major area of research in AI. This led to the development of algorithms such as decision trees and neural networks.\n6. 1980s: Expert Systems: The development of expert systems, which were designed to mimic the decision-making abilities of human experts, reached its peak in the 1980s. These systems were widely used in industries such as banking and healthcare.\n7. 1990s: AI Winter: Despite the progress that had been made in AI research, the field experienced a decline in funding and interest in the 1990s, which became known as the "AI winter."\n8. 2000s: Machine Learning Resurgence: The resurgence of machine learning, driven by advances in computational power and data storage, led to a new wave of AI research and applications.\n9. 2010s: Deep Learning: The development of deep learning algorithms, which are capable of learning complex patterns in large datasets, marked a significant breakthrough in AI research. These algorithms have been used in applications such as image and speech recognition, natural language processing, and autonomous vehicles.\n10. Present Day: AI is now being applied to a wide range of industries and domains, including healthcare, finance, transportation, and education. The field is continuing to evolve, with new technologies and applications emerging all the time.\n\nOverall, the history of AI reflects a long-standing interest in creating machines that can simulate human intelligence. While the field has experienced periods of progress and setbacks, it continues to evolve and expand into new areas of research and application.')
```
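Recent LangChain versions also expose a `stream()` iterator on chat models, which avoids wiring up a callback manager. A hedged sketch (availability depends on your installed version):

```python
# Assumes a LangChain version in which chat models implement .stream().
for chunk in chat_model.stream("Tell me about the history of AI"):
    print(chunk.content, end="", flush=True)
```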
Extraction
Update your version of Ollama and supply the `format` flag, which forces the model to produce JSON output.
Note: You can also try out the experimental `OllamaFunctions` wrapper for convenience.
```python
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chat_models import ChatOllama

chat_model = ChatOllama(
    model="llama2",
    format="json",
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
)
```
```python
from langchain.schema import HumanMessage

messages = [
    HumanMessage(
        content="What color is the sky at different times of the day? Respond using JSON"
    )
]

chat_model_response = chat_model(messages)
```
{"morning": {"sky": "pink", "sun": "rise"}, "daytime": {"sky": "blue", "sun": "high"}, "afternoon": {"sky": "gray", "sun": "peak"}, "evening": {"sky": "orange", "sun": "set"}}
We can also pass a JSON schema in the prompt to steer the structure of the output:

```python
import json

from langchain.schema import HumanMessage

json_schema = {
    "title": "Person",
    "description": "Identifying information about a person.",
    "type": "object",
    "properties": {
        "name": {"title": "Name", "description": "The person's name", "type": "string"},
        "age": {"title": "Age", "description": "The person's age", "type": "integer"},
        "fav_food": {
            "title": "Fav Food",
            "description": "The person's favorite food",
            "type": "string",
        },
    },
    "required": ["name", "age"],
}

messages = [
    HumanMessage(
        content="Please tell me about a person using the following JSON schema:"
    ),
    HumanMessage(content=json.dumps(json_schema, indent=2)),
    HumanMessage(
        content="Now, considering the schema, tell me about a person named John who is 35 years old and loves pizza."
    ),
]

chat_model_response = chat_model(messages)
```
```json
{
    "name": "John",
    "age": 35,
    "fav_food": "pizza"
}
```
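Since the response content is a JSON string, it can be loaded directly into a Python object. A small sketch (this parsing step is an addition, not part of the original example):

```python
import json

# chat_model_response.content holds the raw JSON string emitted by the model.
person = json.loads(chat_model_response.content)
print(person["name"], person["age"])
```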
Multi-modal
Ollama has support for multi-modal LLMs, such as bakllava and llava.

Browse the full set of versions for models with tags, such as here.

Download the desired LLM with `ollama pull bakllava`.

Be sure to update Ollama so that you have the most recent version, which supports multi-modal models.
```python
%pip install pillow
```
```python
import base64
from io import BytesIO

from IPython.display import HTML, display
from PIL import Image


def convert_to_base64(pil_image):
    """
    Convert PIL images to Base64 encoded strings

    :param pil_image: PIL image
    :return: Base64 string
    """
    buffered = BytesIO()
    pil_image.save(buffered, format="JPEG")  # You can change the format if needed
    img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")
    return img_str


def plt_img_base64(img_base64):
    """
    Display a base64 encoded string as an image

    :param img_base64: Base64 string
    """
    # Create an HTML img tag with the base64 string as the source
    image_html = f'<img src="data:image/jpeg;base64,{img_base64}" />'
    # Display the image by rendering the HTML
    display(HTML(image_html))
```
```python
file_path = "/Users/rlm/Desktop/Eval_Sets/multi_modal_presentations/DDOG/img_23.jpg"
pil_image = Image.open(file_path)
image_b64 = convert_to_base64(pil_image)
plt_img_base64(image_b64)
```
```python
from langchain.chat_models import ChatOllama
from langchain_core.messages import HumanMessage

chat_model = ChatOllama(
    model="bakllava",
)

# Call the chat model with both text and image content parts
content_parts = []

image_part = {
    "type": "image_url",
    "image_url": f"data:image/jpeg;base64,{image_b64}",
}

text_part = {"type": "text", "text": "What is the Dollar-based gross retention rate?"}

content_parts.append(image_part)
content_parts.append(text_part)

prompt = [HumanMessage(content=content_parts)]

chat_model(prompt)
```
```
AIMessage(content='90%')
```