

When a model is serving hundreds of requests per minute, a single slow or failing prediction can be difficult to isolate from the surrounding noise. Baseten addresses this by assigning a unique request ID to every predict call and returning it in the X-Baseten-Request-Id response header. Because each request carries its own ID, you can trace a single prediction through your model’s logs without sifting through unrelated entries.
Per-request log filtering requires Truss version 0.15.5 or later. Upgrade with `pip install --upgrade truss`.

Scope by environment or deployment

The Logs tab can show entries from a single deployment or from every deployment in an environment. Use the dropdowns at the top of the tab to switch. Environment scope aggregates logs across every deployment in that environment, including past deployments still serving traffic during a rollout. Use it to follow a request across deployment boundaries or to watch a promotion in progress. Deployment scope restricts logs to a single deployment ID. Use it to isolate behavior to one version, such as a development deployment. The same scope applies to live tail and historical search.

Getting the request ID

The first step is capturing the request ID from the response. Baseten includes it in every predict response, regardless of whether the call is synchronous, asynchronous, or gRPC. The exact location depends on the protocol you’re using:
When you make a predict call with curl, include the `-sD-` flags to print the response headers alongside the body:
curl -sD- -X POST "https://model-{MODEL_ID}.api.baseten.co/production/predict" \
  -H "Authorization: Api-Key $BASETEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello"}'
The request ID appears as a response header:
X-Baseten-Request-Id: 31255019cf83c4d0c7492a5006591e1f502a5
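If you call the model from Python instead of curl, the same header is available on the response object. A minimal sketch of capturing it follows; `get_request_id` is a hypothetical helper for plain header dicts (with the `requests` library, `resp.headers` is already case-insensitive, so a direct `.get("X-Baseten-Request-Id")` also works):

```python
def get_request_id(headers):
    """Return the Baseten request ID from a response-header mapping,
    matching the header name case-insensitively."""
    for key, value in headers.items():
        if key.lower() == "x-baseten-request-id":
            return value
    return None

# Example usage with the `requests` library (MODEL_ID is a placeholder):
#
# import os, requests
# resp = requests.post(
#     f"https://model-{MODEL_ID}.api.baseten.co/production/predict",
#     headers={"Authorization": f"Api-Key {os.environ['BASETEN_API_KEY']}"},
#     json={"prompt": "Hello"},
# )
# request_id = get_request_id(resp.headers)
```

Store the returned ID alongside your own application logs so you can look it up later when a request needs investigating.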

Filtering logs by request ID

Once you have a request ID, open the model’s logs page and enter it in the search filter bar using the requestId: prefix:
requestId:31255019cf83c4d0c7492a5006591e1f502a5
The view narrows to show only log entries from that request. Each log line also displays the request ID alongside the replica ID, so you can confirm you’re looking at the right trace even when scrolling through mixed output.

Logging with request context

For standard Truss models, Baseten automatically attaches the request ID to any log emitted via Python’s logging module during a predict call. No configuration is required — just use a logger:
import logging

logger = logging.getLogger(__name__)

class Model:
    def predict(self, request):
        logger.info("Starting prediction")  # request_id is added automatically
        ...

Custom servers

Standard Truss models get request ID logging automatically through the framework's built-in JSON formatter, with no configuration required. Custom servers don't have this built-in support, so you'll need to do two things: extract the x-baseten-request-id header from incoming requests, and include it as a top-level request_id key in your JSON log output. Both steps are covered in the setup guides for custom HTTP servers and custom gRPC servers.
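The two steps above can be sketched with Python's standard logging module. This is an illustrative outline, not the exact formatter Baseten uses: `JsonFormatter` and `handle_predict` are hypothetical names, and the only assumptions carried over from the text are the incoming x-baseten-request-id header and the top-level request_id key in the JSON output:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Minimal JSON formatter that surfaces request_id as a top-level key."""
    def format(self, record):
        entry = {
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # request_id is attached per log call via logging's `extra` argument.
        request_id = getattr(record, "request_id", None)
        if request_id is not None:
            entry["request_id"] = request_id
        return json.dumps(entry)

logger = logging.getLogger("custom-server")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

def handle_predict(headers, payload):
    """Hypothetical request handler for a custom server."""
    # Step 1: extract the header Baseten attaches to each incoming request.
    request_id = headers.get("x-baseten-request-id")
    # Step 2: include it as a top-level request_id key in every log line.
    logger.info("Starting prediction", extra={"request_id": request_id})
```

With this shape in place, the log lines your custom server emits carry the same request_id field the Logs page filters on, so requestId: searches work the same way they do for standard Truss models.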