Parameters

model_id
string
required

The ID of the model.

deployment_id
string
required

The ID of the deployment.

Headers

Authorization
string
required

Your Baseten API key, prefixed with Api-Key and a space (e.g. {"Authorization": "Api-Key abcd1234.abcd1234"}).
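As a minimal sketch of the header format described above, the following Python snippet builds a request object carrying the Authorization header. The function name build_status_request is hypothetical. The full endpoint URL is not given in this section, so it is passed in as a parameter, and the use of GET is an assumption.

```python
import urllib.request


def build_status_request(url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a request with the Api-Key authorization header.

    `url` is the full URL of the /async_queue_status endpoint for a given
    model and deployment; it is supplied by the caller because this sketch
    does not assume a particular host or path layout.
    """
    # The header value is the literal prefix "Api-Key", a space, then the key.
    headers = {"Authorization": f"Api-Key {api_key}"}
    # Assumption: the endpoint is read with GET.
    return urllib.request.Request(url, headers=headers, method="GET")
```

The request could then be sent with urllib.request.urlopen(request), or the same headers dictionary could be passed to whichever HTTP client you already use.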

Response

model_id
string
required

The ID of the model.

deployment_id
string
required

The ID of the deployment.

num_queued_requests
integer
required

The number of requests in the deployment’s async queue with QUEUED status (i.e. awaiting processing by the model).

num_in_progress_requests
integer
required

The number of requests in the deployment’s async queue with IN_PROGRESS status (i.e. currently being processed by the model).
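To illustrate the response fields listed above, here is a hedged sketch that parses a JSON response body into a small typed record. The QueueStatus class, the parse_queue_status function, and the pending property are hypothetical helpers, not part of any Baseten client, and the sketch assumes the response is a flat JSON object with exactly the four documented fields.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueStatus:
    model_id: str
    deployment_id: str
    num_queued_requests: int
    num_in_progress_requests: int

    @property
    def pending(self) -> int:
        # Requests still waiting plus requests currently being processed.
        return self.num_queued_requests + self.num_in_progress_requests


def parse_queue_status(body: str) -> QueueStatus:
    """Parse a response body, assuming a flat JSON object with the four
    documented fields. A missing field raises KeyError."""
    data = json.loads(body)
    return QueueStatus(
        model_id=data["model_id"],
        deployment_id=data["deployment_id"],
        num_queued_requests=int(data["num_queued_requests"]),
        num_in_progress_requests=int(data["num_in_progress_requests"]),
    )
```

A combined count such as pending can be convenient when deciding whether to submit more async requests, although how to act on it depends on your workload.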

Rate limits

Calls to the /async_queue_status endpoint are limited to 20 requests per second. If this limit is exceeded, subsequent requests will receive a 429 status code.

To handle this rate limit gracefully, we advise implementing a backpressure mechanism, such as retrying /async_queue_status with exponential backoff whenever a request returns a 429 status code.
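The backoff advice above can be sketched as follows. This is one possible approach under stated assumptions, not a prescribed implementation: the function name get_status_with_backoff and its parameters are hypothetical, and fetch stands for any caller-supplied function that performs the HTTP call and returns a (status_code, body) pair. The delay values are illustrative defaults.

```python
import time
from typing import Callable, Tuple


def get_status_with_backoff(
    fetch: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `fetch` until it returns a status other than 429.

    After each 429, wait base_delay * 2**attempt seconds (capped at
    max_delay) before retrying. Raises RuntimeError if every attempt
    is rate limited.
    """
    for attempt in range(max_attempts):
        status, body = fetch()
        if status != 429:
            # Success or a non-rate-limit error: hand it back to the caller.
            return status, body
        if attempt < max_attempts - 1:
            # Double the wait after each consecutive 429, up to the cap.
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

The sleep parameter is injectable so the retry schedule can be tested without real waiting. In production, adding a small random jitter to each delay is a common refinement that keeps many clients from retrying at the same instant.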