When a model is serving hundreds of requests per minute, a single slow or failing prediction can be difficult to isolate from the surrounding noise. Baseten addresses this by assigning a unique request ID to every predict call and returning it in the X-Baseten-Request-Id response header. Because each request carries its own ID, you can trace a single prediction through your model’s logs without sifting through unrelated entries.
Per-request log filtering requires Truss version 0.15.5 or later. Upgrade with:
pip install --upgrade truss
Scope by environment or deployment
The Logs tab can show entries from a single deployment or from every deployment in an environment. Use the dropdowns at the top of the tab to switch.
- Environment scope aggregates logs across every deployment in that environment, including past deployments still serving traffic during a rollout. Use it to follow a request across deployment boundaries or to watch a promotion in progress.
- Deployment scope restricts logs to a single deployment ID. Use it to isolate behavior to one version, such as a development deployment.
The same scope applies to live tail and historical search.
Getting the request ID
The first step is capturing the request ID from the response. Baseten includes it in every predict response, regardless of whether the call is synchronous, asynchronous, or gRPC. The exact location depends on the protocol you’re using:
- HTTP
- gRPC
- Async
Over HTTP, the request ID appears as the X-Baseten-Request-Id response header. When you make a predict call with curl, include the -sD- flag to print response headers alongside the body.
Filtering logs by request ID
Once you have a request ID, open the model’s logs page and enter it in the search filter bar using the requestId: prefix.
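As a rough sketch of how capturing the ID and filtering on it fit together, the Python snippet below reads the X-Baseten-Request-Id header from a predict response and builds the string to enter in the filter bar. The helper names and the `predict_url`, `api_key`, and `payload` parameters are hypothetical. The `Api-Key` authorization scheme is an assumption here, as is the idea that the ID follows the requestId: prefix directly with no space.

```python
import json
import urllib.request


def extract_request_id(headers):
    """Return the X-Baseten-Request-Id value from a header mapping, or None."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    for name, value in headers.items():
        if name.lower() == "x-baseten-request-id":
            return value
    return None


def log_filter_for(request_id):
    """Build the filter-bar search string (assumed format: prefix + ID)."""
    return f"requestId:{request_id}"


def predict_and_get_filter(predict_url, api_key, payload):
    """Call a predict endpoint; return (parsed body, filter string or None)."""
    request = urllib.request.Request(
        predict_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Api-Key {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read())
        request_id = extract_request_id(response.headers)
    # Only build a filter when the header was actually present.
    log_filter = log_filter_for(request_id) if request_id else None
    return body, log_filter
```

Storing the returned filter string alongside your own application logs makes it easier to jump from a client-side error to the matching entries on the model’s logs page.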
Logging with request context
For standard Truss models, Baseten automatically attaches the request ID to any log emitted via Python’s logging module during a predict call. No configuration is required; just use a logger.
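A minimal sketch of what this could look like in a Truss model file is shown below. The placeholder model (an uppercase transform) and the `"text"` input key are made up for illustration. The point is only that log calls inside `predict` use the standard `logging` module and contain no request ID handling of their own.

```python
import logging

# A module-level logger from the standard library; nothing Baseten-specific.
logger = logging.getLogger(__name__)


class Model:
    def __init__(self, **kwargs):
        self._model = None

    def load(self):
        # Placeholder "model" standing in for real weight loading.
        self._model = lambda text: text.upper()

    def predict(self, model_input):
        # These records are emitted during a predict call, so the request ID
        # is expected to be attached by the platform rather than by this code.
        logger.info("Starting prediction")
        result = self._model(model_input["text"])  # "text" is a hypothetical key
        logger.info("Finished prediction, output length %d", len(result))
        return {"output": result}
```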
Custom servers
For standard Truss models, Baseten handles request ID logging automatically through the framework’s built-in JSON formatter. No configuration is required. Custom servers don’t have this built-in support, so you’ll need to do two things: extract the x-baseten-request-id header from incoming requests, and include it as a top-level request_id key in your JSON log output. Both steps are covered in the setup guides for custom HTTP servers and custom gRPC servers.
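The two steps can be sketched with the Python standard library alone. This is an illustrative outline, not the approach from the setup guides: `current_request_id`, `JsonRequestFormatter`, `configure_logging`, and `handle_request` are hypothetical names, and the `level` and `message` keys are assumptions. Only the header name and the top-level `request_id` key come from the text above.

```python
import contextvars
import json
import logging

# Holds the request ID for whichever request is currently being handled.
current_request_id = contextvars.ContextVar("current_request_id", default=None)


def extract_request_id(headers):
    """Step 1: read x-baseten-request-id from incoming headers, or None."""
    for name, value in headers.items():
        if name.lower() == "x-baseten-request-id":
            return value
    return None


class JsonRequestFormatter(logging.Formatter):
    """Step 2: emit JSON log lines with request_id as a top-level key."""

    def format(self, record):
        entry = {
            "level": record.levelname,       # assumed extra field
            "message": record.getMessage(),  # assumed extra field
        }
        request_id = current_request_id.get()
        if request_id is not None:
            entry["request_id"] = request_id
        return json.dumps(entry)


def configure_logging():
    """Route root-logger output through the JSON formatter."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonRequestFormatter())
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(logging.INFO)


def handle_request(headers, work):
    """Run work() with the request ID bound for any logs it emits."""
    token = current_request_id.set(extract_request_id(headers))
    try:
        return work()
    finally:
        # Restore the previous value so IDs never leak between requests.
        current_request_id.reset(token)
```

A context variable is used here so the ID stays correct when requests are handled concurrently in threads or asyncio tasks. In a real server, `handle_request` would typically be replaced by that framework’s middleware or interceptor hook.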