Example usage
Nomic Embed v1.5 is a state-of-the-art text embedding model with two special features:
- You can choose whether to optimize the embeddings for retrieval, search, clustering, or classification.
- You can trade off between cost and accuracy by choosing your own output dimensionality, thanks to Matryoshka Representation Learning.
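With Matryoshka Representation Learning, the first k dimensions of a full-size embedding form a usable lower-dimensional embedding on their own. As a minimal sketch (not part of this API — the model can return the smaller size directly via the `dimensionality` parameter), a client could truncate a full vector and renormalize it to unit length; the sample values below are made up for illustration:

```python
import math

def truncate_embedding(vector, k):
    """Keep the first k dimensions and renormalize to unit length."""
    head = vector[:k]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Hypothetical embedding values; real vectors have up to 768 dimensions
full = [3.0, 4.0, 12.0]
small = truncate_embedding(full, 2)
print(small)  # [0.6, 0.8]
```

In practice, passing `dimensionality` to the model is simpler and cheaper than truncating client-side, since less data is returned over the wire.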
Nomic Embed v1.5 takes the following parameters:
- texts: the strings to embed.
- task_type: the task to optimize the embedding for. Can be search_document (default), search_query, clustering, or classification.
- dimensionality: the size of each output vector. Any integer between 64 and 768 (default).
This code sample demonstrates embedding a set of sentences for retrieval with a dimensionality of 512.
import requests
import os

# Replace the empty string with your model ID below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "texts": ["I want to eat pasta", "I want to eat pizza"],
    "task_type": "search_document",
    "dimensionality": 512
}

# Call the model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data
)

# Print the output of the model
print(res.json())
JSON output
[
  [-0.03811980411410332, "...", -0.023593541234731674],
  [-0.042617011815309525, "...", -0.0191882885992527]
]
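Since search_document embeddings are built for retrieval, a typical next step is to score a search query against the stored document vectors with cosine similarity and pick the best match. A minimal sketch, using short illustrative vectors in place of real 512-dimensional model output:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Illustrative vectors standing in for the JSON output above
doc_embeddings = [[1.0, 0.0], [0.6, 0.8]]
query_embedding = [1.0, 0.0]

scores = [cosine_similarity(query_embedding, d) for d in doc_embeddings]
best = max(range(len(scores)), key=scores.__getitem__)
print(best, scores)  # 0 [1.0, 0.6]
```

For the query side of this workflow, embed the query text with task_type set to search_query rather than search_document.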