Google Generative AI Embeddings
Connect to Google’s generative AI embeddings service using the GoogleGenerativeAIEmbeddings class, found in the langchain-google-genai package.
Installation
%pip install -U langchain-google-genai
Credentials
import getpass
import os

# Prompt for the key only if it is not already set in the environment.
if "GOOGLE_API_KEY" not in os.environ:
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Provide your Google API key here")
Usage
from langchain_google_genai import GoogleGenerativeAIEmbeddings
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vector = embeddings.embed_query("hello, world!")
vector[:5]
[0.05636945, 0.0048285457, -0.0762591, -0.023642512, 0.05329321]
Batch
You can also embed multiple strings at once for a processing speedup:
vectors = embeddings.embed_documents(
    [
        "Today is Monday",
        "Today is Tuesday",
        "Today is April Fools day",
    ]
)
len(vectors), len(vectors[0])
(3, 768)
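Once you have a batch of vectors, a common next step is to compare them. The following is a minimal sketch, not part of the langchain-google-genai package: cosine_similarity and pairwise_similarity are hypothetical helpers, and toy_vectors is a small stand-in for the vectors list returned above so the snippet runs without an API key.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pairwise_similarity(vectors):
    """Return an n x n matrix (list of lists) of cosine similarities."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy 2-dimensional stand-ins (assumption); real embeddings here have 768 dimensions.
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matrix = pairwise_similarity(toy_vectors)
```

With real embeddings, you would pass the vectors list from embed_documents in place of toy_vectors.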
Task type
GoogleGenerativeAIEmbeddings optionally supports a task_type, which currently must be one of:
- task_type_unspecified
- retrieval_query
- retrieval_document
- semantic_similarity
- classification
- clustering
By default, we use retrieval_document in the embed_documents method and retrieval_query in the embed_query method. If you provide a task type, we will use that for all methods.
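The selection rule described above can be sketched as a small function. This only illustrates the documented behavior and is not the library's actual implementation; resolve_task_type and DEFAULT_TASK_TYPES are hypothetical names introduced for this example.

```python
from typing import Optional

# Per-method defaults as described in the text (illustrative, not library code).
DEFAULT_TASK_TYPES = {
    "embed_documents": "retrieval_document",
    "embed_query": "retrieval_query",
}


def resolve_task_type(method: str, task_type: Optional[str] = None) -> str:
    """Hypothetical helper: return the task type that applies to a method call."""
    if task_type is not None:
        # An explicitly provided task type overrides the default for every method.
        return task_type
    return DEFAULT_TASK_TYPES[method]


default_for_query = resolve_task_type("embed_query")
override_for_docs = resolve_task_type("embed_documents", "clustering")
```

In this sketch, default_for_query falls back to the per-method default, while override_for_docs uses the explicit value regardless of the method.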
%pip install --quiet matplotlib scikit-learn
query_embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001", task_type="retrieval_query"
)
doc_embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001", task_type="retrieval_document"
)
All of these will be embedded with the ‘retrieval_query’ task set
query_vecs = [query_embeddings.embed_query(q) for q in [query, query_2, answer_1]]
All of these will be embedded with the ‘retrieval_document’ task set
doc_vecs = [doc_embeddings.embed_query(q) for q in [query, query_2, answer_1]]
In retrieval, relative distance matters. In the image above, you can see the difference in similarity scores between the “relevant doc” and “simil… stronger delta between the similar query and relevant doc in the latter case.
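To make the idea of relative distance concrete, here is a minimal sketch that measures how much closer a query is to one document than to another. The helper names (cosine_similarity, similarity_gap) and the toy vectors are assumptions made for illustration; in practice you would pass in vectors such as those in query_vecs and doc_vecs.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_gap(query_vec, relevant_vec, other_vec):
    """Hypothetical helper: how much more similar the query is to
    relevant_vec than to other_vec (larger means better separation)."""
    return cosine_similarity(query_vec, relevant_vec) - cosine_similarity(
        query_vec, other_vec
    )


# Toy vectors (assumption) standing in for real query and document embeddings.
toy_query = [1.0, 0.0]
toy_relevant = [3.0, 4.0]  # cosine similarity with toy_query is 3/5
toy_other = [0.0, 1.0]     # orthogonal to toy_query
gap = similarity_gap(toy_query, toy_relevant, toy_other)
```

Comparing this gap across task types is one way to check whether a given task_type setting separates relevant from less relevant documents more clearly for your data.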