Deploy Nomic Embed v1.5

Example usage

Nomic Embed v1.5 is a state-of-the-art text embedding model with two special features:

  • You can choose whether to optimize the embeddings for retrieval, search, clustering, or classification.
  • You can trade off cost against accuracy by choosing your own output dimensionality, thanks to Matryoshka Representation Learning (see the sizing sketch after this list).
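
As a rough illustration of the cost side of that trade-off, here is a back-of-envelope sizing sketch. This is illustrative arithmetic only, not an API call, and it assumes embeddings are stored as 4-byte float32 values:

# Approximate index size for 1 million stored embeddings at different dimensionalities
NUM_VECTORS = 1_000_000
BYTES_PER_FLOAT32 = 4

for dims in (768, 512, 256, 64):
    size_gb = NUM_VECTORS * dims * BYTES_PER_FLOAT32 / 1e9
    print(f"{dims:>3} dims -> {size_gb:.2f} GB")

# Lower dimensionality cuts storage and similarity-search cost proportionally,
# at some loss in accuracy that you should measure on your own data.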

Nomic Embed v1.5 takes the following parameters:

  • texts: a list of strings to embed.
  • task_type: the task to optimize the embeddings for. Can be search_document (default), search_query, clustering, or classification.
  • dimensionality: the size of each output vector. Any integer between 64 and 768 (default 768).

This code sample demonstrates embedding two sentences as documents for retrieval, with a dimensionality of 512.

import requests
import os

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

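# Request payload: embed two documents for retrieval at 512 dimensions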
data = {
    "texts": ["I want to eat pasta", "I want to eat pizza"],
    "task_type": "search_document",
    "dimensionality": 512
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data
)

# Print the output of the model
print(res.json())

JSON output

[
  [-0.03811980411410332, "...", -0.023593541234731674],
  [-0.042617011815309525, "...", -0.0191882885992527]
]
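
The endpoint returns one embedding per input string, in the same order as texts. As a follow-up sketch, continuing from the variables in the example above and assuming numpy is installed, you could embed a query with task_type set to search_query and rank the documents by cosine similarity:

import numpy as np

# Embed the query side of a retrieval pair with the same endpoint.
# Use the same dimensionality as the documents so the vectors are comparable.
query_res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json={
        "texts": ["What should I have for dinner?"],
        "task_type": "search_query",
        "dimensionality": 512
    }
)

# Cosine similarity between the query and each document embedding
docs = np.array(res.json())            # shape: (2, 512)
query = np.array(query_res.json())[0]  # shape: (512,)
scores = docs @ query / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query))
print(scores)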