Asynchronously call a named environment of a model.
curl --request POST \
  --url https://model-{model_id}.api.baseten.co/environments/{env_name}/async_predict \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model_input": {},
  "webhook_endpoint": "<string>",
  "priority": 0,
  "max_time_in_queue_seconds": 600,
  "inference_retry_config": {
    "max_attempts": 3,
    "initial_delay_ms": 1000,
    "max_delay_ms": 5000
  }
}
'
Example response:
{
  "request_id": "<string>"
}
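The curl call above can also be issued from Python. A minimal sketch, assuming the `requests` library for the actual send; the model ID, environment name, and API key below are placeholders, and the helper name is hypothetical. It also enforces the 256 KiB payload limit client-side before sending.

```python
import json


def build_async_predict_request(model_id, env_name, api_key, model_input,
                                webhook_endpoint=None, priority=0,
                                max_time_in_queue_seconds=600):
    """Assemble the URL, headers, and JSON body for an async_predict call."""
    url = (f"https://model-{model_id}.api.baseten.co"
           f"/environments/{env_name}/async_predict")
    headers = {
        "Authorization": f"Api-Key {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model_input": model_input,
        "priority": priority,
        "max_time_in_queue_seconds": max_time_in_queue_seconds,
    }
    if webhook_endpoint is not None:
        body["webhook_endpoint"] = webhook_endpoint
    payload = json.dumps(body)
    # Async predict payloads are limited to 256 KiB; fail fast before sending.
    if len(payload.encode("utf-8")) > 256 * 1024:
        raise ValueError("async predict payload exceeds 256 KiB limit")
    return url, headers, payload
```

The returned triple can then be sent with, e.g., `requests.post(url, headers=headers, data=payload)`; the response JSON contains the `request_id`.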

Authorizations

Authorization
string
header
required

API key with the Api-Key prefix, e.g. Authorization: Api-Key abcd1234.abcd1234.

Path Parameters

env_name
string
required

The name of the environment (e.g. production, staging).

Body

application/json

There is a 256 KiB size limit on async predict request payloads.

model_input
object
required

JSON-serializable model input.

webhook_endpoint
string<uri>

HTTPS URL to receive the prediction result via webhook. Both HTTP/2 and HTTP/1.1 are supported. If omitted, the model must save outputs so they can be accessed later.

priority
integer
default:0

Priority of the request. Lower values are higher priority.

Required range: 0 <= x <= 2

max_time_in_queue_seconds
integer
default:600

Maximum time in seconds a request will spend in the queue before expiring. Must be between 10 seconds and 72 hours.

Required range: 10 <= x <= 259200

inference_retry_config
object

Exponential backoff parameters for retrying predict requests.
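To illustrate how the fields in the request example (`max_attempts`, `initial_delay_ms`, `max_delay_ms`) plausibly interact: a common exponential-backoff schedule doubles the delay between attempts and caps it at the maximum. The doubling factor is an assumption; the doc only states that retries use exponential backoff.

```python
def retry_delays_ms(max_attempts=3, initial_delay_ms=1000, max_delay_ms=5000):
    """Delay before each retry, assuming doubling capped at max_delay_ms.

    No delay precedes the first attempt, so max_attempts attempts
    produce max_attempts - 1 delays.
    """
    delays = []
    for retry in range(max_attempts - 1):
        delays.append(min(initial_delay_ms * 2 ** retry, max_delay_ms))
    return delays
```

With the example config (`max_attempts: 3`, `initial_delay_ms: 1000`, `max_delay_ms: 5000`) this yields delays of 1000 ms and 2000 ms; a fifth attempt under the same config would be capped at 5000 ms.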

Response

Async predict request enqueued.

request_id
string
required

The ID of the async request.