Get all model deployments
curl --request GET \
--url https://api.baseten.co/v1/models/{model_id}/deployments \
--header "Authorization: Api-Key $BASETEN_API_KEY"
{
"deployments": [
{
"id": "<string>",
"created_at": "2023-11-07T05:31:56Z",
"name": "<string>",
"model_id": "<string>",
"is_production": true,
"is_development": true,
"status": "BUILDING",
"active_replica_count": 123,
"autoscaling_settings": {
"min_replica": 123,
"max_replica": 123,
"autoscaling_window": 123,
"scale_down_delay": 123,
"concurrency_target": 123
}
}
]
}
Authorizations
You must specify the scheme 'Api-Key' in the Authorization header. For example, Authorization: Api-Key <Your_Api_Key>
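For readers calling the endpoint from code rather than curl, the following is a minimal Python sketch of the same request using only the standard library. The helper names `build_deployments_request` and `list_deployments`, the 30-second timeout, and the `MODEL_ID` environment variable are assumptions made for illustration; only the URL shape and the `Api-Key` header scheme come from this page.

```python
import json
import os
import urllib.request

API_BASE = "https://api.baseten.co/v1"


def build_deployments_request(model_id, api_key):
    """Build the GET request for listing a model's deployments.

    Hypothetical helper: it only assembles the URL and the
    'Authorization: Api-Key <key>' header described above.
    """
    url = f"{API_BASE}/models/{model_id}/deployments"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Api-Key {api_key}"},
        method="GET",
    )


def list_deployments(model_id, api_key):
    """Send the request and return the 'deployments' list from the JSON body."""
    request = build_deployments_request(model_id, api_key)
    # The timeout value is an arbitrary choice for this sketch.
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["deployments"]


if __name__ == "__main__":
    # MODEL_ID is an assumed environment variable, not part of the API.
    for deployment in list_deployments(
        os.environ["MODEL_ID"], os.environ["BASETEN_API_KEY"]
    ):
        print(deployment["id"], deployment["status"])
```

Keeping request construction separate from the network call makes it possible to check the header and URL without sending anything.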
Path Parameters
- model_id: Unique identifier of the model (the {model_id} segment of the request URL)
Response
- deployments: A list of deployments of a model
  - id: Unique identifier of the deployment
  - created_at: Time the deployment was created in ISO 8601 format
  - name: Name of the deployment
  - model_id: Unique identifier of the model
  - is_production: Whether the deployment is the production deployment of the model
  - is_development: Whether the deployment is the development deployment of the model
  - status: Status of the deployment. One of BUILDING, DEPLOYING, DEPLOY_FAILED, LOADING_MODEL, ACTIVE, UNHEALTHY, BUILD_FAILED, BUILD_STOPPED, DEACTIVATING, INACTIVE, FAILED, UPDATING, SCALED_TO_ZERO, WAKING_UP
  - active_replica_count: Number of active replicas
  - autoscaling_settings: Autoscaling settings for the deployment
    - min_replica: Minimum number of replicas
    - max_replica: Maximum number of replicas
    - autoscaling_window: Timeframe of traffic considered for autoscaling decisions
    - scale_down_delay: Waiting period before scaling down any active replica
    - concurrency_target: Number of requests per replica before scaling up
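As a rough illustration of working with the response fields, the Python sketch below picks out the production deployment and filters deployments by status. The sample payload is invented for this example (the IDs, names, and numbers are not real data), and the helper names `production_deployment` and `ids_with_status` are hypothetical; only the field names and status values are taken from this page.

```python
import json

# Status values as listed in the response documentation.
KNOWN_STATUSES = {
    "BUILDING", "DEPLOYING", "DEPLOY_FAILED", "LOADING_MODEL", "ACTIVE",
    "UNHEALTHY", "BUILD_FAILED", "BUILD_STOPPED", "DEACTIVATING", "INACTIVE",
    "FAILED", "UPDATING", "SCALED_TO_ZERO", "WAKING_UP",
}

# Invented sample response, shaped like the documented schema.
SAMPLE_BODY = """
{
  "deployments": [
    {"id": "dep_a", "name": "prod", "model_id": "m1",
     "is_production": true, "is_development": false,
     "status": "ACTIVE", "active_replica_count": 2},
    {"id": "dep_b", "name": "dev", "model_id": "m1",
     "is_production": false, "is_development": true,
     "status": "SCALED_TO_ZERO", "active_replica_count": 0}
  ]
}
"""


def production_deployment(payload):
    """Return the deployment flagged is_production, or None if there is none."""
    for deployment in payload["deployments"]:
        if deployment["is_production"]:
            return deployment
    return None


def ids_with_status(payload, status):
    """Return the IDs of deployments whose status equals the given value."""
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unknown status: {status}")
    return [d["id"] for d in payload["deployments"] if d["status"] == status]


payload = json.loads(SAMPLE_BODY)
prod = production_deployment(payload)
print(prod["id"], prod["status"])
print(ids_with_status(payload, "SCALED_TO_ZERO"))
```

Validating the status argument against the documented set catches typos such as `"SCALED_TO_ZER0"` early instead of silently returning an empty list.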