Supported models
Enable a model from the Model APIs page in the Baseten dashboard.

| Model | Slug | Context | Max output |
|---|---|---|---|
| DeepSeek V3 0324 | deepseek-ai/DeepSeek-V3-0324 | 164k | 131k |
| DeepSeek V3.1 | deepseek-ai/DeepSeek-V3.1 | 164k | 131k |
| GLM 4.6 | zai-org/GLM-4.6 | 200k | 200k |
| GLM 4.7 | zai-org/GLM-4.7 | 200k | 200k |
| GLM 5 | zai-org/GLM-5 | 203k | 203k |
| Kimi K2.5 | moonshotai/Kimi-K2.5 | 262k | 262k |
| Minimax M2.5 | MiniMaxAI/MiniMax-M2.5 | 204k | 204k |
| OpenAI GPT OSS 120B | openai/gpt-oss-120b | 128k | 128k |
Pricing
Pricing is per million tokens.

| Model | Input | Output |
|---|---|---|
| OpenAI GPT OSS 120B | $0.10 | $0.50 |
| Minimax M2.5 | $0.30 | $1.20 |
| DeepSeek V3.1 | $0.50 | $1.50 |
| GLM 4.6 | $0.60 | $2.20 |
| GLM 4.7 | $0.60 | $2.20 |
| Kimi K2.5 | $0.60 | $3.00 |
| DeepSeek V3 0324 | $0.77 | $0.77 |
| GLM 5 | $0.95 | $3.15 |
Query the /v1/models endpoint for current pricing.
Feature support
All models support tool calling. Support for other features varies by model. See Reasoning for configuration details.

| Model | JSON mode | Structured outputs | Reasoning | Vision |
|---|---|---|---|---|
| DeepSeek V3 0324 | Yes | Yes | Enabled by default | No |
| DeepSeek V3.1 | No | No | Enabled by default | No |
| GLM 4.6 | Yes | Yes | Opt-in | No |
| GLM 4.7 | Yes | Yes | Opt-in | No |
| GLM 5 | Yes | Yes | No | No |
| Kimi K2.5 | Yes | Yes | Opt-in | Yes |
| Minimax M2.5 | Yes | Yes | Enabled by default | No |
| OpenAI GPT OSS 120B | Yes | Yes | Enabled by default | No |
All models support the top_p and top_k sampling parameters.
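As a sketch, the sampling parameters are passed alongside the other request fields in the Chat Completions body; the model slug, prompt, and parameter values below are illustrative:

```python
# Illustrative request body showing the sampling knobs; values are examples,
# not recommendations.
payload = {
    "model": "zai-org/GLM-4.6",
    "messages": [{"role": "user", "content": "Write a haiku about rain."}],
    "top_p": 0.9,  # nucleus sampling: keep the smallest token set with cumulative probability 0.9
    "top_k": 40,   # consider only the 40 highest-probability tokens at each step
}
```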
Create a chat completion
If you’ve already completed the quickstart, you have a working client. The examples below show a multi-turn conversation with a system message, which you can adapt for your application.

- Python
- JavaScript
- cURL
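A minimal Python sketch of the multi-turn pattern, using only the standard library (the OpenAI SDK works against the same endpoint); the model slug, conversation content, and the `BASETEN_API_KEY` environment variable are assumptions for illustration:

```python
import json
import os
import urllib.request

BASE_URL = "https://inference.baseten.co/v1"

# Multi-turn conversation: a system message, then alternating user/assistant turns.
payload = {
    "model": "deepseek-ai/DeepSeek-V3.1",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "Paris."},
        {"role": "user", "content": "And of Italy?"},
    ],
}

def chat_completion(body: dict) -> dict:
    """POST the body to /v1/chat/completions and return the parsed JSON."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['BASETEN_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__" and os.environ.get("BASETEN_API_KEY"):
    reply = chat_completion(payload)
    print(reply["choices"][0]["message"]["content"])
```

The `messages` list carries the full conversation on every request; append each assistant reply and the next user turn before the following call.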
Features
Model APIs are compatible with the OpenAI Chat Completions API. Available features include structured outputs, tool calling, reasoning, vision, and streaming (stream: true). Not all models support every feature. See feature support for per-model availability.
For the complete parameter reference, see the Chat Completions API documentation.
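As one sketch of the OpenAI-compatible feature surface, a tool-calling request adds a `tools` array to the same body; the `get_weather` function below is hypothetical, and `stream` is shown with its default:

```python
# Hypothetical tool definition in the OpenAI Chat Completions shape.
payload = {
    "model": "moonshotai/Kimi-K2.5",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "stream": False,  # set True to receive server-sent event chunks
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool, for illustration only
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}
```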
List available models
Query the /v1/models endpoint for the current list of models, with metadata including pricing, context lengths, and supported features.
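A minimal sketch of the query, assuming the response follows the OpenAI models-list shape (a `data` array of objects with an `id` field) and a `BASETEN_API_KEY` environment variable:

```python
import json
import os
import urllib.request

BASE_URL = "https://inference.baseten.co/v1"

def list_models_request(api_key: str) -> urllib.request.Request:
    """Build a GET request for the /v1/models endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )

if os.environ.get("BASETEN_API_KEY"):
    req = list_models_request(os.environ["BASETEN_API_KEY"])
    with urllib.request.urlopen(req) as resp:
        # Assumes the OpenAI-style list shape: {"data": [{"id": ...}, ...]}
        for model in json.loads(resp.read())["data"]:
            print(model["id"])
```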
Migrate from OpenAI
To migrate existing OpenAI code to Baseten, change three values:

- Replace your API key with a Baseten API key.
- Change the base URL to https://inference.baseten.co/v1.
- Update the model name to a Baseten model slug.
Handle errors
Model APIs return standard HTTP error codes:

| Code | Meaning |
|---|---|
| 400 | Invalid request (check your parameters) |
| 401 | Invalid or missing API key |
| 402 | Payment required |
| 404 | Model not found |
| 429 | Rate limit exceeded |
| 500 | Internal server error |
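One way to surface these codes as actionable errors, sketched with the standard library (an SDK's own exception types would serve the same purpose); the hint strings mirror the table above:

```python
import json
import urllib.error
import urllib.request

# Hints keyed by the HTTP status codes in the table above.
ERROR_HINTS = {
    400: "Invalid request (check your parameters)",
    401: "Invalid or missing API key",
    402: "Payment required",
    404: "Model not found",
    429: "Rate limit exceeded",
    500: "Internal server error",
}

def describe_error(code: int) -> str:
    """Map an HTTP status code to a human-readable hint."""
    return ERROR_HINTS.get(code, f"Unexpected HTTP status {code}")

def send(req: urllib.request.Request) -> dict:
    """Send a request, translating HTTP errors into RuntimeError with a hint."""
    try:
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())
    except urllib.error.HTTPError as err:
        raise RuntimeError(describe_error(err.code)) from err
```

For 429 responses in particular, retrying with exponential backoff is the usual remedy; 4xx codes generally indicate a request that should be fixed rather than retried.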