API reference
Details on model inference and management APIs
Baseten provides two sets of API endpoints:
- An inference API for calling deployed models
- A management API for managing your models and workspace
Many inference and management API endpoints have different routes for the three types of deployments (development, production, and individual published deployments), which are listed separately in the sidebar.
Inference API
Each model deployed on Baseten has its own subdomain on `api.baseten.co` to enable faster routing. This subdomain is used for inference endpoints, which are formatted as follows:

```
https://model-{model_id}.api.baseten.co/{deployment_type_or_id}/{endpoint}
```
Where:

- `model_id` is the alphanumeric ID of the model, which you can find in your model dashboard.
- `deployment_type_or_id` is one of `development`, `production`, or a separate alphanumeric ID for a specific published deployment of the model.
- `endpoint` is a supported endpoint, such as `predict`, that you want to call.
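The URL pattern above can be sketched in code. Everything here uses only the Python standard library; the model ID `abc123`, the request payload, and the `Api-Key` authorization header format are assumptions for illustration, so substitute your own values.

```python
def inference_url(model_id: str, deployment_type_or_id: str, endpoint: str) -> str:
    """Assemble an inference URL for a deployed model, per the pattern above."""
    return f"https://model-{model_id}.api.baseten.co/{deployment_type_or_id}/{endpoint}"


# Example: the production deployment's predict endpoint for a
# hypothetical model ID "abc123".
url = inference_url("abc123", "production", "predict")
print(url)  # https://model-abc123.api.baseten.co/production/predict

# A call sketch using the standard library (not executed here, since it
# needs a real model ID and API key):
#
# import json, urllib.request
# req = urllib.request.Request(
#     url,
#     data=json.dumps({"prompt": "hello"}).encode(),
#     headers={
#         "Authorization": "Api-Key YOUR_API_KEY",
#         "Content-Type": "application/json",
#     },
# )
# with urllib.request.urlopen(req) as resp:
#     print(resp.read())
```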
The inference API also supports asynchronous inference for long-running tasks and priority queuing.
Management API
Management API endpoints all run through the base `api.baseten.co` subdomain. Use management API endpoints for monitoring, CI/CD, and building both model-level and workspace-level automations.
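To contrast with the model-specific inference subdomains, a management call targets the base domain directly. The `/v1/models` path and the `Api-Key` header format are assumptions for illustration; the request is built but not sent, since sending it requires a real API key.

```python
import urllib.request


def list_models_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request against the base management domain."""
    return urllib.request.Request(
        # Management endpoints use api.baseten.co, not a model subdomain.
        "https://api.baseten.co/v1/models",  # assumed path for illustration
        headers={"Authorization": f"Api-Key {api_key}"},
    )


req = list_models_request("YOUR_API_KEY")
print(req.full_url)  # https://api.baseten.co/v1/models
```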