Published deployment
Use this endpoint to get the status of a published deployment’s async queue.
Parameters
The ID of the model.
The ID of the deployment.
Headers
Your Baseten API key, formatted with prefix Api-Key (e.g. {"Authorization": "Api-Key abcd1234.abcd1234"}).
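As a minimal sketch of the header format described above, the helper below builds the Authorization header from an API key. The function name build_auth_headers is hypothetical and not part of any Baseten client library.

```python
def build_auth_headers(api_key: str) -> dict:
    """Return request headers carrying the API key with the Api-Key prefix.

    build_auth_headers is an illustrative helper, not an official API.
    """
    key = api_key.strip()
    if not key:
        raise ValueError("api_key must be a non-empty string")
    # The header value is the literal prefix "Api-Key", a space, then the key.
    return {"Authorization": f"Api-Key {key}"}
```

Calling build_auth_headers("abcd1234.abcd1234") yields the same dictionary shown in the example above.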
Response
The ID of the model.
The ID of the deployment.
The number of requests in the deployment’s async queue with QUEUED status (i.e. awaiting processing by the model).
The number of requests in the deployment’s async queue with IN_PROGRESS status (i.e. currently being processed by the model).
Rate limits
Calls to the /async_queue_status endpoint are limited to 20 requests per second. If this limit is exceeded, subsequent requests will receive a 429 status code.
To handle this rate limit gracefully, we advise implementing a backpressure mechanism, such as retrying calls to /async_queue_status with exponential backoff in response to 429 errors.
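One way to implement the backoff described above is sketched here. It is an illustration under stated assumptions, not an official client: call_with_backoff, its parameters, and the default delays are all hypothetical, and fetch stands in for whatever function performs the HTTP request to /async_queue_status and returns a (status_code, parsed_body) pair.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    fetch: Callable[[], Tuple[int, dict]],
    max_retries: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch(); on a 429 response, wait and retry with growing delays.

    All names and defaults here are assumptions chosen for illustration.
    """
    for attempt in range(max_retries + 1):
        status, body = fetch()
        if status != 429:
            if status >= 400:
                # Other errors are not retried in this sketch.
                raise RuntimeError(f"request failed with status {status}")
            return body
        if attempt == max_retries:
            break
        # The delay doubles on each attempt, capped at max_delay.
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Random jitter keeps multiple clients from retrying in lockstep.
        sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("still rate limited after all retries")
```

In practice, fetch would wrap a GET request that sends the Authorization header and returns the response status along with the decoded JSON body. Passing sleep as a parameter keeps the retry logic testable without real waiting.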