Get model deployment metrics
Gets the metrics for a model deployment in the given time range.
Authorizations
Pass your Baseten API key. Clients automatically send Authorization: Bearer <key>. Direct callers can also use Authorization: Api-Key <key>; both schemes are accepted.
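As a minimal sketch of the two header forms described above, the helper below builds the Authorization header for either scheme. The function name `auth_headers` and its `scheme` argument are illustrative and not part of any Baseten client; only the `Bearer` and `Api-Key` prefixes come from the text above.

```python
def auth_headers(api_key: str, scheme: str = "Bearer") -> dict[str, str]:
    """Build the Authorization header for one of the two accepted schemes.

    `auth_headers` is a hypothetical helper, not part of an official client.
    """
    if scheme not in ("Bearer", "Api-Key"):
        raise ValueError(f"unsupported scheme: {scheme!r}")
    if not api_key:
        raise ValueError("api_key must be non-empty")
    # Header value is "<scheme> <key>", e.g. "Api-Key abc123".
    return {"Authorization": f"{scheme} {api_key}"}
```

The returned dictionary can be passed as the headers of whichever HTTP client is in use.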
Query Parameters
How metric values are aggregated over the request. 'CURRENT': a single instantaneous snapshot at now; start/end must be omitted. 'SUMMARY': a single value set aggregating the whole window. 'SERIES': evenly-spaced value sets across the window, with the step derived from the window duration.
Available options: CURRENT, SUMMARY, SERIES
Epoch millis timestamp to start fetching metrics. Defaults to one hour before the end.
Epoch millis timestamp to end fetching metrics. Defaults to the current time. The window between start and end must not exceed 7 days.
Names of the metrics to return; see https://docs.baseten.co/observability/export-metrics/supported-metrics for the available names. When omitted, a default set is returned: baseten_replicas_active, baseten_inference_requests_total, and baseten_end_to_end_response_time_seconds. Unknown names are rejected; valid names that do not apply to the deployment are omitted from the response.
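The time-window rules above (epoch-millisecond timestamps, a one-hour default, a 7-day maximum, and no window in CURRENT mode) can be checked on the client before a request is sent. The sketch below is one way to do that. The helper names and the `start`/`end` dictionary keys are assumptions made for illustration; the server applies its own defaults and validation regardless.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_WINDOW = timedelta(days=7)       # documented upper bound on end - start
DEFAULT_WINDOW = timedelta(hours=1)  # documented default: start = end - 1 hour


def to_epoch_millis(dt: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)


def build_window(mode: str,
                 end: Optional[datetime] = None,
                 start: Optional[datetime] = None) -> dict:
    """Return window parameters for the given mode (hypothetical helper).

    Pass timezone-aware datetimes; the key names "start" and "end" are assumed.
    """
    if mode == "CURRENT":
        # CURRENT is a snapshot at now, so no window may be supplied.
        if start is not None or end is not None:
            raise ValueError("start/end must be omitted in CURRENT mode")
        return {}
    if mode not in ("SUMMARY", "SERIES"):
        raise ValueError(f"unknown mode: {mode!r}")
    end = end or datetime.now(timezone.utc)
    start = start or end - DEFAULT_WINDOW
    if start >= end:
        raise ValueError("start must be earlier than end")
    if end - start > MAX_WINDOW:
        raise ValueError("window must not exceed 7 days")
    return {"start": to_epoch_millis(start), "end": to_epoch_millis(end)}
```

Mirroring the defaults on the client is optional; omitting both timestamps and letting the server fill them in gives the same one-hour window.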
Response
Deployment metrics over a time window, index-mapped: metric descriptors appear once in metric_descriptors; each value set's values are aligned to that order.
Start of the returned window.
End of the returned window.
The aggregation mode used.
Available options: CURRENT, SUMMARY, SERIES
Seconds per step; populated only in SERIES mode, null otherwise.
Descriptors for each metric; position defines the values index.
Metric values per time step covering the window. In SUMMARY mode this always contains exactly one value set spanning the whole window.
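Because the response is index-mapped, a value only has meaning when paired with the descriptor at the same position. The sketch below shows that pairing for one value set. The helper `label_values` is hypothetical, the exact field names inside a descriptor or value set are not given here, and the numbers in the usage lines are made up for illustration.

```python
def label_values(descriptor_names: list[str], values: list) -> dict:
    """Pair each value with the metric name at the same index.

    `descriptor_names` is assumed to be the metric names taken, in order,
    from metric_descriptors; `values` is one value set's list of values.
    """
    if len(descriptor_names) != len(values):
        raise ValueError("values are not aligned with the descriptors")
    return dict(zip(descriptor_names, values))


# Illustrative usage with invented numbers, not real API output.
names = ["baseten_replicas_active", "baseten_inference_requests_total"]
labeled = label_values(names, [2, 150])
```

In SERIES mode the same pairing would be applied to each value set in turn, reusing the single descriptor list.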