- Model ID: Found in the Baseten dashboard or returned when you deploy.
- API key: Authenticates your requests.
- JSON-serializable model input: The data your model expects.
Authentication
Include your API key in theAuthorization header:
Request
predict.py
Baseten also accepts the legacy
Authorization: Api-Key <api_key> scheme on every endpoint, so existing scripts continue to work:Request
Predict API endpoints
Baseten provides multiple endpoints for different inference modes:/predict: Standard synchronous inference./async_predict: Asynchronous inference for long-running tasks.
Sync API endpoints
Custom servers support bothpredict endpoints and a special sync endpoint. Use the sync endpoint to call different routes in your custom server:
URL
https://model-{model_id}.../sync/health->/healthhttps://model-{model_id}.../sync/items->/itemshttps://model-{model_id}.../sync/items/123->/items/123
OpenAI SDK
When you deploy a model with Engine-Builder, you’ll get an OpenAI-compatible server. If you already use one of the OpenAI SDKs, update the base URL to your Baseten model URL and include your Baseten API key:openai_client.py
External LLM gateways
Any LLM gateway that speaks the OpenAI protocol, such as LiteLLM or OpenRouter, can route traffic to a Baseten deployment. Configure the gateway with three values:- Base URL:
https://model-{model_id}.api.baseten.co/environments/production/sync/v1, using the model ID for your deployment. Click API endpoint on the model page in the Baseten dashboard to copy the full URL. - Model name: The value of
--served-model-namefrom your deployment’sstart_command. See the vLLM example for where this is set. When a single gateway routes to several deployments, use anorg/modelnaming convention (for example,acme/llama-3-70b) to keep routing unambiguous. - API key: A Baseten API key with access to the deployment.
{base_url}/chat/completions with model set to the served model name and an Authorization: Bearer <key> header.
Alternative invocation methods
- Baseten CLI:
baseten model predict - Model Dashboard: “Playground” button in the Baseten UI