Using Model APIs on Baseten
Use Baseten’s OpenAI-compatible Model APIs for LLMs, including structured outputs and tool calling.
Baseten provides OpenAI-compatible API endpoints for all available Model APIs. This means you can use standard OpenAI client libraries—no wrappers, no rewrites, no surprises. If your code already works with OpenAI, it’ll work with Baseten.
This guide walks you through getting started, making your first call, and using advanced features like structured outputs and tool calling.
Prerequisites
Before you begin, make sure you have:
- A Baseten account
- An API key
- The OpenAI client library for your language of choice
Supported models
Baseten currently offers several high-performing open-source LLMs as Model APIs:
- DeepSeek R1 (slug: `deepseek-ai/DeepSeek-R1`)
- DeepSeek V3 (slug: `deepseek-ai/DeepSeek-V3-0324`)
- Llama 4 Maverick (slug: `meta-llama/Llama-4-Maverick-17B-128E-Instruct`)
- Llama 4 Scout (slug: `meta-llama/Llama-4-Scout-17B-16E-Instruct`)
- Qwen 3 🔜

Update the `model` parameter in the examples below to the slug of the model you'd like to test.
Make your first API call
Request parameters
Model APIs support all commonly used OpenAI ChatCompletions parameters, including:
- `model`: Slug of the model you want to call (see above)
- `messages`: Array of message objects (`role` + `content`)
- `temperature`: Controls randomness (0–2, default 1)
- `max_tokens`: Maximum number of tokens to generate
- `stream`: Boolean to enable streaming responses
Structured outputs
To get structured JSON output from the model, use the `response_format` parameter. Set `response_format={"type": "json_object"}` to enable JSON mode. For more complex schemas, you can define a JSON schema.
Let’s say you want to extract specific information from a user’s query, like a name and an email address.
When `strict: true` is specified within the `json_schema`, the model is constrained to produce output that strictly adheres to the provided schema. If the model cannot or will not produce output that matches the schema, it may return an error or a refusal.
Tool calling
Model compatibility note: We recommend DeepSeek V3 for tool calling. We do not recommend DeepSeek R1 for tool calling, as that model was not post-trained for it.
Tool calling is fully supported. Define a list of tools and pass them via the `tools` parameter:
- `type`: The type of tool to call. Currently, the only supported value is `function`.
- `function`: A dictionary with the following keys:
  - `name`: The name of the function to be called
  - `description`: A description of what the function does
  - `parameters`: A JSON Schema object describing the function parameters
Here’s how you might implement tool calling:
Error Handling
The API returns standard HTTP error codes:
- `400`: Bad request (malformed input)
- `401`: Unauthorized (invalid or missing API key)
- `402`: Payment required
- `404`: Model not found
- `429`: Rate limit exceeded
- `500`: Internal server error
Check the response body for specific error details and suggested resolutions.
Migrating from OpenAI
To migrate from OpenAI to Baseten’s OpenAI-compatible API, you need to make these changes to your existing code:
- Replace your OpenAI API key with your Baseten API key.
- Change the base URL to `https://inference.baseten.co/v1`.
- Update model names to match Baseten-supported slugs.