Payload
Baseten POSTs a JSON body to your configured webhook URL. Every payload uses the standard Baseten envelope, where type is the discriminator and data holds the event-specific fields. Frontier Gateway emits the API_BILLING_USAGE event type; future event types may share the same envelope.
The data.events array can contain one or more events per delivery. Each event corresponds to a single inference request.
Stable identifier for the event. Use this to deduplicate on your side.
ISO 8601 UTC timestamp of the inference request.
Per-request identifier, useful for correlating billing events with platform logs.
Freeform JSON object passed through from the inference request. May be null when no metadata is supplied.
The model slug invoked, in your-org/your-model form.
The metadata.external_entity_id you set on the group that owns the key used for the request. The wire field name is externalCustomerId for historical reasons; the value is the same one you write under metadata.external_entity_id when you create or update the group.
Prefix of the federated API key that made the request (the substring before the . in the full key string). The group identifies your customer; the prefix identifies which of that customer’s keys drove the usage.
Token counts for the request.
- inputTokens (integer, required): Prompt tokens.
- outputTokens (integer, required): Generated tokens.
- cachedInputTokens (integer, required): Prompt tokens served from cache, when applicable.
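As a rough illustration of the envelope described above, the sketch below reads type as the discriminator, pulls the events out of data.events, and drops events whose idempotencyKey has already been seen. The function names and the in-memory set are hypothetical, and the placement of idempotencyKey directly on each event object is an assumption; a real receiver would deduplicate against durable storage.

```python
import json

# Event type emitted by Frontier Gateway, per the payload description.
BILLING_EVENT_TYPE = "API_BILLING_USAGE"


def extract_billing_events(raw_body: bytes) -> list:
    """Return the events of a billing delivery, or [] for any other type."""
    payload = json.loads(raw_body)
    if payload.get("type") != BILLING_EVENT_TYPE:
        # Other event types may share the envelope; this sketch ignores them.
        return []
    return list(payload["data"]["events"])


def new_events(events: list, seen_keys: set) -> list:
    """Keep only events whose idempotencyKey has not been seen before.

    Assumes each event carries an "idempotencyKey" field at its top level.
    """
    fresh = []
    for event in events:
        key = event["idempotencyKey"]
        if key not in seen_keys:
            seen_keys.add(key)
            fresh.append(event)
    return fresh
```

Because one delivery can carry several events, deduplication here is per event rather than per delivery.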
Headers
Baseten sets two headers on every delivery:
- X-Baseten-Signature: HMAC signature of the raw request body. For more information, see Verify the signature.
- X-Baseten-Request-ID: UUID generated per outbound delivery. Log this on your receiver as a correlation ID for debugging against Baseten platform logs. Use idempotencyKey, not this header, to dedupe events on your side; the same requestId is reused across retry attempts of a single delivery.
Verify the signature
The X-Baseten-Signature header has the format v1=<hex>, where <hex> is the HMAC-SHA256 of the raw request body computed with your workspace’s webhook signing secret. Verify the signature on every request before trusting the payload.
Two requirements:
- Verify against the raw bytes of the request body, not a re-serialized version. JSON re-serialization changes whitespace and field order and breaks the HMAC.
- Use a constant-time comparison (hmac.compare_digest in Python, crypto.timingSafeEqual in Node.js) to avoid timing attacks.
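A minimal Python sketch of the check described above follows. It is not Baseten's reference implementation: the function name is hypothetical, and it assumes the signing secret is used as UTF-8 bytes and that the hex digest may arrive in either case.

```python
import hashlib
import hmac
from typing import Optional

SIGNATURE_PREFIX = "v1="


def verify_signature(raw_body: bytes, signature_header: Optional[str], secret: str) -> bool:
    """Check an X-Baseten-Signature value against the raw request body."""
    if not signature_header or not signature_header.startswith(SIGNATURE_PREFIX):
        return False
    received_hex = signature_header[len(SIGNATURE_PREFIX):].lower()
    # HMAC-SHA256 over the raw bytes exactly as received, never re-serialized JSON.
    # Assumption: the signing secret is keyed in as its UTF-8 encoding.
    expected_hex = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison; comparing bytes avoids a TypeError on non-ASCII input.
    return hmac.compare_digest(expected_hex.encode("utf-8"), received_hex.encode("utf-8"))
```

Pass the body bytes exactly as your web framework received them, before any JSON parsing.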
Delivery semantics
Baseten retries failed deliveries with exponential backoff so a transient blip on your endpoint doesn’t drop billing events. Use these numbers to size your endpoint SLOs and to know when a failure is terminal.
- Per-attempt timeout: 10 seconds. If your endpoint doesn’t respond within this window, Baseten cancels the attempt and treats it as a failure.
- Backoff: Exponential, starting at 1 second between attempts and capping at 5 seconds.
- Maximum elapsed time: 15 seconds. After this, Baseten stops retrying and routes the event to a dead-letter queue. The retry window is tight: the realistic budget is one or two attempts.
- 4xx responses are terminal: Any 4xx status from your endpoint stops retries immediately. Only 5xx responses, network errors, and timeouts trigger a retry.
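To get a feel for how tight the retry window is, the numbers above can be plugged into a small estimate. This is only a sketch: the doubling multiplier, the choice to measure the window from the start of the first attempt, and the function name are all assumptions that the limits above do not specify.

```python
def attempt_start_times(response_seconds: float,
                        attempt_timeout: float = 10.0,
                        initial_backoff: float = 1.0,
                        max_backoff: float = 5.0,
                        max_elapsed: float = 15.0,
                        multiplier: float = 2.0) -> list:
    """Estimate when each attempt starts if every attempt fails.

    response_seconds is how long your endpoint takes to fail (5xx or hang);
    it is capped at the per-attempt timeout. multiplier is an ASSUMED
    backoff growth factor, not a documented value.
    """
    starts = []
    clock = 0.0
    backoff = initial_backoff
    while clock < max_elapsed:
        starts.append(clock)
        # Time spent on the failed attempt, then the wait before the next one.
        clock += min(response_seconds, attempt_timeout) + backoff
        backoff = min(backoff * multiplier, max_backoff)
    return starts
```

Under these assumptions, an endpoint that hangs until the 10-second timeout gets attempts starting at 0 and 11 seconds, and whether the second one can complete inside the window is not something this estimate settles. An endpoint that fails quickly leaves room for more attempts.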
Recommended consumption pattern
Treat the webhook handler as an ingestion shim, not a billing pipeline. The handler’s job is to durably accept the event and return as fast as possible.
- Verify the signature.
- Persist the event to your own queue or database, keyed on idempotencyKey.
- Return a 2xx response.
- Process and forward to your billing provider asynchronously.
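The steps above might look roughly like the following framework-free sketch. Everything named here is hypothetical: the SQLite table, the function names, and the specific status codes (401, 400, 500) are illustrative choices, and verify stands in for whatever signature check you use. The one behavior taken from the delivery rules is that a 4xx ends retries while a 5xx invites another attempt.

```python
import json
import sqlite3
from typing import Callable


def init_store(conn: sqlite3.Connection) -> None:
    """Create a hypothetical inbox table keyed on idempotencyKey."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS billing_events ("
        "idempotency_key TEXT PRIMARY KEY, "
        "body TEXT NOT NULL, "
        "processed INTEGER NOT NULL DEFAULT 0)"
    )


def handle_delivery(raw_body: bytes,
                    signature: str,
                    verify: Callable[[bytes, str], bool],
                    conn: sqlite3.Connection) -> int:
    """Verify, persist, and return an HTTP status code; no billing work here."""
    if not verify(raw_body, signature):
        return 401  # 4xx is terminal: a bad signature should not be retried
    try:
        events = json.loads(raw_body)["data"]["events"]
        rows = [(event["idempotencyKey"], json.dumps(event)) for event in events]
    except (ValueError, KeyError, TypeError):
        return 400  # malformed payload; retrying would not help
    try:
        with conn:  # one transaction per delivery
            # INSERT OR IGNORE makes redelivered events a no-op.
            conn.executemany(
                "INSERT OR IGNORE INTO billing_events (idempotency_key, body) "
                "VALUES (?, ?)",
                rows,
            )
    except sqlite3.Error:
        return 500  # 5xx asks Baseten to retry within its window
    return 200
```

A separate worker would then read unprocessed rows and forward them to the billing provider, keeping slow downstream calls out of the request path.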
Next steps
- Manage groups and API keys: Create groups, build a hierarchy, mint and revoke keys, and delete groups.
- Rate and usage limits: Cap per-group, per-model token and request volume.