- Model ID
- An API key for your Baseten account.
- JSON-serializable model input
Predict API endpoints
Baseten provides multiple endpoints for different inference modes:
- /predict – Standard synchronous inference.
- /async_predict – Asynchronous inference for long-running tasks.
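As a rough illustration of how the three inputs listed above (model URL, API key, JSON-serializable input) fit together, here is a minimal sketch that builds a synchronous /predict request with the Python standard library. The function name `build_predict_request`, the placeholder host, and the `Api-Key` authorization scheme are assumptions made for this example and are not stated in the text above; check your model's page for the exact URL and header. The request body for /async_predict is not described here, so it is left out.

```python
import json
import urllib.request


def build_predict_request(base_url: str, api_key: str, model_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request to a model's /predict endpoint.

    base_url: the model URL copied from your Baseten account (placeholder below).
    api_key: your Baseten API key.
    model_input: any JSON-serializable input your model accepts.
    """
    body = json.dumps(model_input).encode("utf-8")  # raises TypeError if not serializable
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=body,
        headers={
            # Assumed header scheme; confirm against your account's API docs.
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def predict(base_url: str, api_key: str, model_input: dict):
    """Send the request and decode the JSON response."""
    request = build_predict_request(base_url, api_key, model_input)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `predict(base_url, api_key, {"prompt": "hello"})` would block until the model responds, which matches the synchronous mode described above; the `prompt` field is only a stand-in for whatever input your model expects.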
Sync API endpoints
Custom servers support both predict endpoints as well as a special sync endpoint. The sync endpoint lets you call different routes in your custom server.
- https://model-{model_id}.../sync/health -> /health
- https://model-{model_id}.../sync/items -> /items
- https://model-{model_id}.../sync/items/123 -> /items/123
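The mapping above can be sketched as a pair of small helpers: one builds the sync URL for a given server route, the other recovers the route your custom server would see. Both function names and the placeholder host are hypothetical, and the sketch assumes only what the examples show, namely that everything after `/sync` is forwarded unchanged.

```python
from urllib.parse import urlsplit

SYNC_PREFIX = "/sync"


def sync_url(base_url: str, route: str) -> str:
    """Return the sync endpoint URL that targets `route` on the custom server."""
    return f"{base_url.rstrip('/')}{SYNC_PREFIX}/{route.lstrip('/')}"


def server_route(url: str) -> str:
    """Return the route the custom server receives for a given sync URL."""
    path = urlsplit(url).path
    # Split on the first "/sync/" segment; whatever follows is the server route.
    _, separator, remainder = path.partition(SYNC_PREFIX + "/")
    if not separator:
        raise ValueError(f"not a sync endpoint URL: {url}")
    return "/" + remainder


if __name__ == "__main__":
    base = "https://model-abc123.example.invalid"  # placeholder model URL
    for route in ("/health", "/items", "/items/123"):
        url = sync_url(base, route)
        print(url, "->", server_route(url))
```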
OpenAI SDK
When you deploy a model with Engine Builder, you get an OpenAI-compatible server. If you already use one of the OpenAI SDKs, you only need to update the base URL to your Baseten model URL and include your Baseten API key.
Alternative invocation methods
- Truss CLI: truss predict
- Model Dashboard: “Playground” button in the Baseten UI
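Returning to the OpenAI SDK option described earlier: the two changes it mentions (base URL and API key) can be sketched as follows for the OpenAI Python SDK. The helper names are hypothetical, the exact model URL to use is not given here and must come from your deployment, and the `openai` package is a third-party dependency that is imported only when a client is actually created.

```python
def openai_client_settings(model_url: str, baseten_api_key: str) -> dict:
    """Return the two settings that point an OpenAI SDK client at a Baseten model."""
    if not model_url or not baseten_api_key:
        raise ValueError("both the model URL and the Baseten API key are required")
    return {"base_url": model_url, "api_key": baseten_api_key}


def make_client(model_url: str, baseten_api_key: str):
    """Create an OpenAI SDK client that talks to the Baseten deployment."""
    from openai import OpenAI  # third-party package: pip install openai

    return OpenAI(**openai_client_settings(model_url, baseten_api_key))
```

Existing code that already uses an OpenAI client should then work against the deployment without further changes, provided the model name and request options it sends are ones the deployed server accepts.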