GET /v1/models/{model_id}/deployments/{deployment_id}/metrics
cURL
curl --request GET \
--url https://api.baseten.co/v1/models/{model_id}/deployments/{deployment_id}/metrics \
--header "Authorization: Bearer $BASETEN_API_KEY"
{
  "start_epoch_millis": 123,
  "end_epoch_millis": 123,
  "step_seconds": 123,
  "metric_descriptors": [
    {
      "name": "<string>",
      "label_sets": [
        {}
      ]
    }
  ],
  "metric_values": [
    {
      "start_epoch_millis": 123,
      "values": [
        [
          123
        ]
      ]
    }
  ]
}

Authorizations

Authorization
string
header
required

Pass your Baseten API key. Clients automatically send Authorization: Bearer <key>. Direct callers can also use Authorization: Api-Key <key>; both schemes are accepted.
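As a minimal sketch of the two accepted schemes, the helper below builds the Authorization header for either one. The function name `auth_headers` and its `scheme` parameter are hypothetical, chosen for illustration; only the two header formats come from the description above.

```python
def auth_headers(api_key: str, scheme: str = "Bearer") -> dict[str, str]:
    """Return an Authorization header using one of the two accepted schemes.

    `auth_headers` is an illustrative helper, not part of any Baseten client.
    """
    if scheme not in ("Bearer", "Api-Key"):
        raise ValueError(f"unsupported scheme: {scheme!r}")
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {"Authorization": f"{scheme} {api_key}"}
```

Either `auth_headers(key)` or `auth_headers(key, "Api-Key")` should be accepted by the endpoint, per the description above.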

Path Parameters

model_id
string
required
deployment_id
string
required

Query Parameters

mode
enum<string>
default:CURRENT

How metric values are aggregated over the request. 'CURRENT': a single instantaneous snapshot taken at the current time; start_epoch_millis and end_epoch_millis must be omitted. 'SUMMARY': a single value set aggregating the whole window. 'SERIES': evenly spaced value sets across the window, with the step derived from the window duration.

Available options:
CURRENT,
SUMMARY,
SERIES
start_epoch_millis
integer | null

Epoch millis timestamp to start fetching metrics. Defaults to one hour before the end.

end_epoch_millis
integer | null

Epoch millis timestamp to end fetching metrics. Defaults to the current time. The window between start and end must not exceed 7 days.

metrics
string[]

Names of the metrics to return; see https://docs.baseten.co/observability/export-metrics/supported-metrics for the available names. When omitted, a default set is returned: baseten_replicas_active, baseten_inference_requests_total, and baseten_end_to_end_response_time_seconds. Unknown names are rejected; valid names that do not apply to the deployment are omitted from the response.
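The query parameter rules above can be checked on the client before a request is sent. The sketch below builds the request URL and enforces two of the documented constraints: CURRENT mode takes no window, and an explicit window must not exceed 7 days. The function name `build_metrics_url` is hypothetical, and sending `metrics` as a repeated query key is an assumption, since the array encoding is not stated here.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.baseten.co"
MAX_WINDOW_MILLIS = 7 * 24 * 60 * 60 * 1000  # documented 7-day limit
MODES = ("CURRENT", "SUMMARY", "SERIES")


def build_metrics_url(model_id, deployment_id, mode="CURRENT",
                      start_epoch_millis=None, end_epoch_millis=None,
                      metrics=None):
    """Build the metrics URL, validating the documented constraints."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    has_window = start_epoch_millis is not None or end_epoch_millis is not None
    if mode == "CURRENT" and has_window:
        raise ValueError("CURRENT mode requires start and end to be omitted")
    # The window can only be checked locally when both bounds are explicit;
    # otherwise the server resolves the defaults.
    if start_epoch_millis is not None and end_epoch_millis is not None:
        if end_epoch_millis - start_epoch_millis > MAX_WINDOW_MILLIS:
            raise ValueError("window must not exceed 7 days")

    params = [("mode", mode)]
    if start_epoch_millis is not None:
        params.append(("start_epoch_millis", start_epoch_millis))
    if end_epoch_millis is not None:
        params.append(("end_epoch_millis", end_epoch_millis))
    # Assumption: array values are sent as repeated `metrics=` keys.
    for name in metrics or []:
        params.append(("metrics", name))

    path = (f"/v1/models/{quote(model_id, safe='')}"
            f"/deployments/{quote(deployment_id, safe='')}/metrics")
    return f"{BASE_URL}{path}?{urlencode(params)}"
```

For example, `build_metrics_url("abc", "def", mode="SERIES", start_epoch_millis=0, end_epoch_millis=3600000)` requests a one-hour series. Unknown metric names are still rejected by the server, so this check does not replace handling error responses.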

Response

200 - application/json

Deployment metrics over a time window, index-mapped: metric descriptors appear once in metric_descriptors; each value set's values are aligned to that order.

start_epoch_millis
integer
required

Start of the returned window.

end_epoch_millis
integer
required

End of the returned window.

mode
enum<string>
required

The aggregation mode used.

Available options:
CURRENT,
SUMMARY,
SERIES
step_seconds
integer | null
required

Seconds per step; populated only in SERIES mode, null otherwise.

metric_descriptors
DeploymentMetricDescriptorV1 · object[]
required

Descriptors for each metric; position defines the values index.

metric_values
DeploymentMetricValueSetV1 · object[]
required

Metric values per time step covering the window. In SUMMARY mode this always contains exactly one value set spanning the whole window.
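Because the response is index-mapped, a client has to join metric_descriptors with each value set by position. The sketch below flattens a response into one row per value. The function name `flatten_metrics` is hypothetical, and it assumes that the inner list in `values[i]` lines up with `label_sets` of descriptor `i`; the example payload suggests this, but the text here does not state it.

```python
def flatten_metrics(response: dict) -> list[dict]:
    """Turn an index-mapped metrics response into one row per value.

    Assumption: values[i][j] belongs to metric_descriptors[i] and to
    that descriptor's label_sets[j].
    """
    rows = []
    descriptors = response["metric_descriptors"]
    for value_set in response["metric_values"]:
        # values[i] is aligned with descriptors[i] by position.
        for descriptor, per_label in zip(descriptors, value_set["values"]):
            for labels, value in zip(descriptor["label_sets"], per_label):
                rows.append({
                    "start_epoch_millis": value_set["start_epoch_millis"],
                    "name": descriptor["name"],
                    "labels": labels,
                    "value": value,
                })
    return rows
```

In SUMMARY mode the result holds the rows of a single value set; in SERIES mode each step contributes its own rows, distinguished by `start_epoch_millis`.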