curl --request PATCH \
--url https://api.baseten.co/v1/models/{model_id}/deployments/{deployment_id}/autoscaling_settings \
--header "Authorization: Api-Key $BASETEN_API_KEY" \
--header "Content-Type: application/json" \
--data '{
  "min_replica": 0,
  "max_replica": 7,
  "autoscaling_window": 600,
  "scale_down_delay": 120,
  "concurrency_target": 2
}'

{
  "status": "ACCEPTED",
  "message": "<string>"
}

Authorizations

Authorization
string
header
required

You must specify the scheme 'Api-Key' in the Authorization header. For example, Authorization: Api-Key <Your_Api_Key>

Path Parameters

model_id
string
required

deployment_id
string
required
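
As a minimal sketch of how the two required path parameters fit into the endpoint URL, the helper below fills them in and percent-encodes each one. The function name `autoscaling_settings_url` and the sample IDs are hypothetical, chosen only for illustration.

```python
from urllib.parse import quote

BASE_URL = "https://api.baseten.co/v1"


def autoscaling_settings_url(model_id: str, deployment_id: str) -> str:
    """Fill the two required path parameters into the endpoint URL."""
    if not model_id or not deployment_id:
        raise ValueError("model_id and deployment_id are both required")
    # Percent-encode each ID so it cannot alter the path structure.
    return (
        f"{BASE_URL}/models/{quote(model_id, safe='')}"
        f"/deployments/{quote(deployment_id, safe='')}/autoscaling_settings"
    )


# Placeholder IDs, not real resources.
print(autoscaling_settings_url("abc123", "dep456"))
```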

Body

application/json

A request to update autoscaling settings for a deployment. All fields are optional; only the fields included in the request are updated.
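
Because only the fields present in the body are updated, a client can build a partial payload that leaves every other setting untouched. The sketch below shows one way to do that. The helper name `build_autoscaling_patch` is hypothetical, and the choice to drop unset fields rather than send them as `null` is an assumption, since the handling of an explicit `null` is not described here.

```python
from typing import Optional


def build_autoscaling_patch(
    min_replica: Optional[int] = None,
    max_replica: Optional[int] = None,
    autoscaling_window: Optional[int] = None,
    scale_down_delay: Optional[int] = None,
    concurrency_target: Optional[int] = None,
) -> dict:
    """Return a request body containing only the settings the caller set."""
    candidates = {
        "min_replica": min_replica,
        "max_replica": max_replica,
        "autoscaling_window": autoscaling_window,
        "scale_down_delay": scale_down_delay,
        "concurrency_target": concurrency_target,
    }
    # Compare against None (not truthiness) so a legitimate 0 is kept.
    return {name: value for name, value in candidates.items() if value is not None}


# Only max_replica is sent; the other settings are omitted from the body.
print(build_autoscaling_patch(max_replica=7))
```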

min_replica
integer | null

Minimum number of replicas

Example:

0

max_replica
integer | null

Maximum number of replicas

Example:

7

autoscaling_window
integer | null

Timeframe of traffic, in seconds, considered for autoscaling decisions

Example:

600

scale_down_delay
integer | null

Waiting period, in seconds, before scaling down any active replica

Example:

120

concurrency_target
integer | null

Number of concurrent requests each replica handles before the autoscaler scales up

Example:

2
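
Putting the body fields together with the `Api-Key` authorization scheme, the sketch below prepares the same PATCH request as the curl example using only the Python standard library. The helper name `build_patch_request` and the placeholder IDs and key are assumptions for illustration. The request is constructed but not sent.

```python
import json
import urllib.request


def build_patch_request(url: str, api_key: str, settings: dict) -> urllib.request.Request:
    """Prepare (but do not send) a PATCH request with a JSON body."""
    body = json.dumps(settings).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            # The scheme must be 'Api-Key', followed by the key itself.
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_patch_request(
    # MODEL_ID, DEPLOYMENT_ID, and YOUR_API_KEY are placeholders.
    "https://api.baseten.co/v1/models/MODEL_ID/deployments/DEPLOYMENT_ID/autoscaling_settings",
    "YOUR_API_KEY",
    {
        "min_replica": 0,
        "max_replica": 7,
        "autoscaling_window": 600,
        "scale_down_delay": 120,
        "concurrency_target": 2,
    },
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. That step is left out so the sketch runs without network access or a real key.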

Response

200 - application/json

The response to a request to update autoscaling settings.

status
enum<string>
required

Status of the request to update autoscaling settings

Available options: ACCEPTED, QUEUED, UNCHANGED

message
string
required

A message describing the status of the request to update autoscaling settings
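
A client may want to confirm that a response carries both required fields and one of the three documented status values before acting on it. The following is a minimal sketch of such a check. The function name `parse_update_response` and the sample message text are hypothetical, and the sketch makes no assumption about what each status implies beyond its name.

```python
import json

KNOWN_STATUSES = {"ACCEPTED", "QUEUED", "UNCHANGED"}


def parse_update_response(raw: str) -> tuple:
    """Validate a 200 response body and return (status, message)."""
    data = json.loads(raw)
    # Both fields are documented as required.
    missing = [key for key in ("status", "message") if key not in data]
    if missing:
        raise ValueError(f"response is missing required fields: {missing}")
    status = data["status"]
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    return status, data["message"]


# Sample body with an illustrative message string.
status, message = parse_update_response('{"status": "ACCEPTED", "message": "example"}')
print(status, message)
```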