Inference
Call your model: run inference on deployed models
Once deployed, your model is accessible via an API endpoint. To make an inference request, you’ll need:
- Your model's ID.
- An API key for your Baseten account.
- JSON-serializable model input.
API Endpoints
Baseten provides multiple endpoints for different inference modes:
- /predict – standard synchronous inference.
- /async_predict – asynchronous inference for long-running tasks.
Both endpoints are available for environments as well as for individual deployments. See the API reference for details.
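As a minimal sketch of a synchronous request, the snippet below builds a /predict call with the three required pieces (model ID, API key, JSON input). The endpoint URL format, the model ID, and the input payload shape are assumptions for illustration; confirm the exact URL and authentication header in the API reference.

```python
import json
import os
import urllib.request

MODEL_ID = "abcd1234"  # hypothetical model ID; use your own
API_KEY = os.environ.get("BASETEN_API_KEY", "")

# Endpoint URL format is an assumption; check the API reference for your deployment.
url = f"https://model-{MODEL_ID}.api.baseten.co/environments/production/predict"

# Any JSON-serializable input works; {"prompt": ...} is just an example shape.
payload = json.dumps({"prompt": "Hello"}).encode()

req = urllib.request.Request(
    url,
    data=payload,
    headers={"Authorization": f"Api-Key {API_KEY}"},
    method="POST",
)
# resp = urllib.request.urlopen(req)  # uncomment to actually send the request
print(url)
```

Swapping `/predict` for `/async_predict` in the URL targets the asynchronous endpoint instead.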
Alternative Invocation Methods
- Truss CLI: truss predict
- Model dashboard: the "Playground" button in the Baseten UI