Metrics support matrix
Which metrics can be exported
baseten_inference_requests_total
Cumulative number of requests to the model.
Type: counter
Labels:
The ID of the model.
The name of the model.
The ID of the deployment.
The status code of the response.
Whether the request was an async inference request.
The environment that the deployment corresponds to. Empty if the deployment is not associated with an environment.
Possible values:
"production"
The phase of the deployment in the promote to production process. Empty if the deployment is not associated with an environment.
Possible values:
"promoting"
"stable"
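Because this metric is a cumulative counter, the quantity of interest is usually its rate of change between two scrapes. The following is a minimal sketch, assuming you already hold two (timestamp, value) samples for the same label set. The function name `request_rate` is hypothetical, and the reset handling follows the common Prometheus convention of treating a decrease as a restart from zero.

```python
def request_rate(prev, curr):
    """Return requests per second between two (timestamp_seconds, counter_value) samples."""
    (t0, v0), (t1, v1) = prev, curr
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    # A counter only increases; a drop means it was reset, so count from zero.
    increase = v1 - v0 if v1 >= v0 else v1
    return increase / elapsed
```

For example, samples of 100 and 400 taken 60 seconds apart correspond to an average of 5 requests per second over that window.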
baseten_end_to_end_response_time_seconds
End-to-end response time in seconds.
Type: histogram
Labels:
The ID of the model.
The name of the model.
The ID of the deployment.
The status code of the response.
Whether the request was an async inference request.
The environment that the deployment corresponds to. Empty if the deployment is not associated with an environment.
Possible values:
"production"
The phase of the deployment in the promote to production process. Empty if the deployment is not associated with an environment.
Possible values:
"promoting"
"stable"
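A histogram is typically consumed by estimating quantiles (for example, p95 latency) from its buckets. The sketch below assumes the histogram is exposed as Prometheus-style cumulative buckets, each an upper bound in seconds paired with the count of observations at or below it, ending with an infinite bound. The function name `estimate_quantile` and the bucket boundaries in the usage note are illustrative assumptions, not values taken from the export.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    buckets: list of (upper_bound_seconds, cumulative_count), sorted by
    upper bound, with math.inf as the last bound.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations recorded
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            # The quantile falls in this bucket.
            if math.isinf(upper_bound) or count == lower_count:
                return lower_bound
            # Interpolate linearly within the bucket.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound
```

With hypothetical buckets `[(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]`, the estimated p95 lies halfway through the 0.5 to 1.0 second bucket. The estimate is only as precise as the bucket boundaries allow.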
baseten_container_cpu_usage_seconds_total
Cumulative CPU time consumed by the container in core-seconds.
Type: counter
Labels:
The ID of the model.
The name of the model.
The ID of the deployment.
The ID of the replica.
The environment that the deployment corresponds to. Empty if the deployment is not associated with an environment.
Possible values:
"production"
The phase of the deployment in the promote to production process. Empty if the deployment is not associated with an environment.
Possible values:
"promoting"
"stable"
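Since this counter accumulates core-seconds, dividing its increase by the elapsed wall-clock time gives the average number of CPU cores in use over the interval. This is a minimal sketch under the assumption that both samples come from the same replica. The function name `average_cores_used` is hypothetical.

```python
def average_cores_used(prev, curr):
    """Return average cores used between two (timestamp_seconds, core_seconds) samples.

    Returns None if the counter went backwards (for example, after a container restart).
    """
    (t0, s0), (t1, s1) = prev, curr
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    if s1 < s0:
        return None  # counter reset; this interval cannot be measured reliably
    return (s1 - s0) / elapsed
```

For instance, an increase of 15 core-seconds over a 30-second window means the container used half a core on average.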
baseten_replicas_active
Number of replicas ready to serve model requests.
Type: gauge
Labels:
The ID of the model.
The name of the model.
The ID of the deployment.
The environment that the deployment corresponds to. Empty if the deployment is not associated with an environment.
Possible values:
"production"
The phase of the deployment in the promote to production process. Empty if the deployment is not associated with an environment.
Possible values:
"promoting"
"stable"
baseten_replicas_starting
Number of replicas starting up, i.e. either waiting for resources to become available or loading the model.
Type: gauge
Labels:
The ID of the model.
The name of the model.
The ID of the deployment.
The environment that the deployment corresponds to. Empty if the deployment is not associated with an environment.
Possible values:
"production"
The phase of the deployment in the promote to production process. Empty if the deployment is not associated with an environment.
Possible values:
"promoting"
"stable"
baseten_container_cpu_memory_working_set_bytes
Current working set memory used by the container in bytes.
Type: gauge
Labels:
The ID of the model.
The name of the model.
The ID of the deployment.
The ID of the replica.
The environment that the deployment corresponds to. Empty if the deployment is not associated with an environment.
Possible values:
"production"
The phase of the deployment in the promote to production process. Empty if the deployment is not associated with an environment.
Possible values:
"promoting"
"stable"
baseten_gpu_memory_used
GPU memory used in MiB.
Type: gauge
Labels:
The ID of the model.
The name of the model.
The ID of the deployment.
The ID of the replica.
The ID of the GPU.
The environment that the deployment corresponds to. Empty if the deployment is not associated with an environment.
Possible values:
"production"
The phase of the deployment in the promote to production process. Empty if the deployment is not associated with an environment.
Possible values:
"promoting"
"stable"
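Note that this metric is reported in MiB, whereas the container memory metric is reported in bytes, so a unit conversion is needed before comparing or plotting them together. A small sketch follows; the helper names are hypothetical.

```python
MIB_IN_BYTES = 1024 * 1024  # 1 MiB = 2**20 bytes


def mib_to_bytes(mib):
    """Convert a MiB reading to bytes."""
    return int(mib * MIB_IN_BYTES)


def mib_to_gib(mib):
    """Convert a MiB reading to GiB."""
    return mib / 1024
```

A reading of 16384 MiB corresponds to 16 GiB.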
baseten_gpu_utilization
GPU utilization as a percentage (between 0 and 100).
Type: gauge
Labels:
The ID of the model.
The name of the model.
The ID of the deployment.
The ID of the replica.
The ID of the GPU.
The environment that the deployment corresponds to. Empty if the deployment is not associated with an environment.
Possible values:
"production"
The phase of the deployment in the promote to production process. Empty if the deployment is not associated with an environment.
Possible values:
"promoting"
"stable"