Use Baseten’s OpenAI-compatible Model APIs for LLMs, including structured outputs and tool calling.
- deepseek-ai/DeepSeek-R1-0528
- deepseek-ai/DeepSeek-V3-0324
- meta-llama/Llama-4-Maverick-17B-128E-Instruct
- meta-llama/Llama-4-Scout-17B-16E-Instruct

Set `model` in the examples below to the slug of the model you’d like to test.
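Any of these slugs goes in the `model` field of a chat-completions request. A stdlib-only sketch (the `BASETEN_API_KEY` environment variable name is an assumption — use whichever variable holds your key; `/chat/completions` is the standard OpenAI-compatible path):

```python
import json
import os
import urllib.request

BASE_URL = "https://inference.baseten.co/v1"

def chat(model: str, messages: list, **params) -> dict:
    """POST a chat-completions request to the OpenAI-compatible endpoint."""
    body = json.dumps({"model": model, "messages": messages, **params}).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumed env var name; set it to your Baseten API key.
            "Authorization": f"Bearer {os.environ['BASETEN_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (requires a valid API key):
# reply = chat("deepseek-ai/DeepSeek-V3-0324",
#              [{"role": "user", "content": "Hello!"}])
```

The official `openai` client works the same way: point its `base_url` at the URL above and pass your Baseten API key.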
The API accepts the standard chat-completions parameters:

- `model`: Slug of the model you want to call (see the list above)
- `messages`: Array of message objects (`role` + `content`)
- `temperature`: Controls randomness (0-2, default 1)
- `max_tokens`: Maximum number of tokens to generate
- `stream`: Boolean to enable streaming responses

Structured outputs are controlled by the `response_format` parameter. Set `response_format={"type": "json_object"}` to enable JSON mode. For more complex schemas, you can define a JSON schema.
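The parameters above map directly onto the JSON request body. A sketch of a complete payload (the slug and values are illustrative):

```python
payload = {
    "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize what an LLM is in one sentence."},
    ],
    "temperature": 0.7,   # 0-2, default 1
    "max_tokens": 256,    # cap on generated tokens
    "stream": False,      # set True for streamed responses
}
# POST this JSON to /chat/completions with your API key in the
# Authorization header to get a completion back.
```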
Let’s say you want to extract specific information from a user’s query, like a name and an email address.
When `strict: true` is specified within the `json_schema`, the model is constrained to produce output that strictly adheres to the provided schema. If the model cannot or will not produce output that matches the schema, it may return an error or a refusal.
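For the name-and-email extraction example, a `json_schema` response format with `strict` enabled might look like this sketch (the schema name and fields are illustrative):

```python
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "contact_info",   # illustrative schema name
        "strict": True,           # constrain output to the schema
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "email": {"type": "string"},
            },
            "required": ["name", "email"],
            "additionalProperties": False,
        },
    },
}
# Pass this as the request's response_format; the model must then emit
# JSON matching the schema, or return an error/refusal.
```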
Tool calling is configured via the `tools` parameter. Each tool is an object with:

- `type`: The type of tool to call. Currently, the only supported value is `function`.
- `function`: A dictionary with the following keys:
  - `name`: The name of the function to be called
  - `description`: A description of what the function does
  - `parameters`: A JSON Schema object describing the function parameters
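A tool definition following that structure might be sketched as (the weather function is illustrative, not part of the API):

```python
tools = [
    {
        "type": "function",  # only supported tool type
        "function": {
            "name": "get_weather",  # illustrative function name
            "description": "Get the current weather for a city.",
            "parameters": {  # JSON Schema for the function's arguments
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]
# Include `tools` in the request body; if the model decides to call the
# function, the response contains a `tool_calls` entry with JSON arguments.
```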
Error responses use standard HTTP status codes:

- `400`: Bad request (malformed input)
- `401`: Unauthorized (invalid or missing API key)
- `402`: Payment required
- `404`: Model not found
- `429`: Rate limit exceeded
- `500`: Internal server error

All requests go to the base URL https://inference.baseten.co/v1.
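A client will typically retry on `429` and `500` while failing fast on the 4xx client errors above. A minimal backoff sketch (the retry policy itself is an assumption on the client side, not part of the API):

```python
RETRYABLE = {429, 500}  # rate limits and server errors are worth retrying

def should_retry(status: int) -> bool:
    """Retry transient failures; 4xx client errors are final."""
    return status in RETRYABLE

def backoff_delays(attempts: int, base: float = 0.5) -> list:
    """Exponential backoff schedule: base * 2**n seconds per retry."""
    return [base * (2 ** n) for n in range(attempts)]

# e.g. backoff_delays(3) -> [0.5, 1.0, 2.0]
```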