Prerequisites
Before you begin, make sure you have:
- A Baseten account
- An API key
- The OpenAI client library for your language of choice
Supported models
Baseten currently offers several high-performing open-source LLMs as Model APIs:
- OpenAI GPT OSS 120B (slug: `openai/gpt-oss-120b`)
- Qwen3 235B 2507 (slug: `Qwen/Qwen3-235B-A22B-Instruct-2507`)
- Qwen3 Coder 480B (slug: `Qwen/Qwen3-Coder-480B-A35B-Instruct`)
- Kimi K2 0905 (slug: `moonshotai/Kimi-K2-Instruct-0905`)
- Kimi K2 0711 (slug: `moonshotai/Kimi-K2-Instruct`)
- Deepseek V3.1 (slug: `deepseek-ai/DeepSeek-V3.1`)
- Deepseek R1 0528 (slug: `deepseek-ai/DeepSeek-R1-0528`)
- Deepseek V3 0324 (slug: `deepseek-ai/DeepSeek-V3-0324`)
- Llama 4 Maverick (slug: `meta-llama/Llama-4-Maverick-17B-128E-Instruct`)
- Llama 4 Scout (slug: `meta-llama/Llama-4-Scout-17B-16E-Instruct`)

Set `model` in the examples below to the slug of the model you'd like to test.
Make your first API call
Request parameters
Model APIs support all commonly used OpenAI ChatCompletions parameters, including:
- `model`: Slug of the model you want to call (see the list above)
- `messages`: Array of message objects (`role` + `content`)
- `temperature`: Controls randomness (0-2, default 1)
- `max_tokens`: Maximum number of tokens to generate
- `stream`: Boolean to enable streaming responses
Structured outputs
To get structured JSON output from the model, use the `response_format` parameter. Set `response_format={"type": "json_object"}` to enable JSON mode. For more complex schemas, you can define a JSON schema.
Let’s say you want to extract specific information from a user’s query, like a name and an email address.
When `strict: true` is specified within the `json_schema`, the model is constrained to produce output that strictly adheres to the provided schema. If the model cannot or will not produce output that matches the schema, it may return an error or a refusal.
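For the name-and-email extraction case above, the `response_format` payload might look like this sketch (the schema name `contact_info`, the sample message, and the model slug are illustrative):

```python
# JSON Schema response format for extracting a name and email address.
contact_schema = {
    "type": "json_schema",
    "json_schema": {
        "name": "contact_info",
        "strict": True,  # constrain output to match the schema exactly
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "email": {"type": "string"},
            },
            "required": ["name", "email"],
            "additionalProperties": False,
        },
    },
}

# With a `client` configured as in the first-call example:
# response = client.chat.completions.create(
#     model="deepseek-ai/DeepSeek-V3-0324",
#     messages=[{"role": "user",
#                "content": "Reach me at jane@example.com. -- Jane Doe"}],
#     response_format=contact_schema,
# )
# import json
# contact = json.loads(response.choices[0].message.content)
```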
Tool calling
Model compatibility note: we recommend Deepseek V3 for tool calling. Deepseek R1 was not post-trained for tool use, so we do not recommend it for this purpose.
Each entry in the `tools` parameter has the following fields:
- `type`: The type of tool to call. Currently, the only supported value is `function`.
- `function`: A dictionary with the following keys:
  - `name`: The name of the function to be called
  - `description`: A description of what the function does
  - `parameters`: A JSON Schema object describing the function parameters
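Put together, a tool definition following the fields above might look like this sketch (the `get_weather` function and its parameters are hypothetical):

```python
# A single hypothetical tool the model may choose to call.
tools = [
    {
        "type": "function",  # currently the only supported tool type
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {  # JSON Schema for the function's arguments
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                    "unit": {"type": "string",
                             "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["city"],
            },
        },
    }
]

# With a `client` configured as in the first-call example:
# response = client.chat.completions.create(
#     model="deepseek-ai/DeepSeek-V3-0324",  # see the compatibility note above
#     messages=[{"role": "user", "content": "What's the weather in Paris?"}],
#     tools=tools,
# )
# tool_calls = response.choices[0].message.tool_calls
```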
Error Handling
The API returns standard HTTP error codes:
- `400`: Bad request (malformed input)
- `401`: Unauthorized (invalid or missing API key)
- `402`: Payment required
- `404`: Model not found
- `429`: Rate limit exceeded
- `500`: Internal server error
Migrating from OpenAI
To migrate from OpenAI to Baseten's OpenAI-compatible API, make these changes to your existing code:
- Replace your OpenAI API key with your Baseten API key.
- Change the base URL to `https://inference.baseten.co/v1`.
- Update model names to match Baseten-supported slugs.