When a model is serving hundreds of requests per minute, a single slow or failing prediction can be difficult to isolate from the surrounding noise. Baseten addresses this by assigning a unique request ID to every predict call and returning it in the X-Baseten-Request-Id response header. Because each request carries its own ID, you can trace a single prediction through your model’s logs without sifting through unrelated entries.
Per-request log filtering requires Truss version 0.15.5 or later. Upgrade with `pip install --upgrade truss`.
Getting the request ID
The first step is capturing the request ID from the response. Baseten includes it in every predict response, regardless of whether the call is synchronous, asynchronous, or gRPC. The exact location depends on the protocol you’re using:
When you make a predict call, include the `-sD-` flag to print response headers alongside the body:

```shell
curl -sD- -X POST "https://model-{MODEL_ID}.api.baseten.co/production/predict" \
  -H "Authorization: Api-Key $BASETEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello"}'
```
The request ID appears as a response header:

```
X-Baseten-Request-Id: 31255019cf83c4d0c7492a5006591e1f502a5
```
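If you're calling the endpoint from Python, you can capture the header directly from the response object instead of parsing curl output. This is a minimal sketch using only the standard library; the `extract_request_id` and `predict_with_request_id` names are illustrative, not part of any Baseten SDK:

```python
import json
import os
import urllib.request


def extract_request_id(header_pairs) -> str:
    """Case-insensitive scan of (name, value) header pairs for the request ID."""
    for name, value in header_pairs:
        if name.lower() == "x-baseten-request-id":
            return value
    return ""


def predict_with_request_id(model_id: str, payload: dict):
    """Call the sync predict endpoint; return (body, request_id)."""
    req = urllib.request.Request(
        f"https://model-{model_id}.api.baseten.co/production/predict",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Api-Key {os.environ['BASETEN_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read()), extract_request_id(resp.headers.items())
```

Logging the returned ID alongside your own application logs makes it easy to correlate a failed call with its trace later.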
For gRPC calls, the request ID arrives in the response trailer metadata rather than an HTTP header. Use the `-vv` flag with grpcurl to surface it:

```shell
grpcurl -vv \
  -H "baseten-authorization: Api-Key $BASETEN_API_KEY" \
  -H "baseten-model-id: model-{MODEL_ID}" \
  -d '{"name": "World"}' \
  model-{MODEL_ID}.grpc.api.baseten.co:443 \
  example.Greeter/SayHello
```
Look for `x-baseten-request-id` in the trailer metadata at the end of the response:

```
x-baseten-request-id: 31255019cf83c4d0c7492a5006591e1f502a5
```
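From a Python grpcio client, the same value is available as the call's trailing metadata, which is a sequence of key/value pairs. A small helper, sketched here without a live channel (the function name and the `SayHello` stub in the docstring are illustrative), can pull it out:

```python
def request_id_from_trailers(trailing_metadata) -> str:
    """Scan gRPC trailing metadata (key/value pairs) for the request ID.

    With grpcio you would obtain the pairs from a call object, e.g.:

        response, call = stub.SayHello.with_call(request)
        request_id = request_id_from_trailers(call.trailing_metadata())
    """
    for key, value in trailing_metadata:
        if key.lower() == "x-baseten-request-id":
            return str(value)
    return ""
```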
Async predict calls return the request ID in two places: the response header and the JSON body. This makes it easy to capture programmatically without parsing headers:

```shell
curl -sD- -X POST "https://model-{MODEL_ID}.api.baseten.co/production/async_predict" \
  -H "Authorization: Api-Key $BASETEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello"}'
```

```
X-Baseten-Request-Id: 31255019cf83c4d0c7492a5006591e1f502a5
{"request_id": "31255019cf83c4d0c7492a5006591e1f502a5"}
```
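Because the async body carries the ID as a top-level `request_id` field, a one-line JSON parse is enough. A minimal sketch (the helper name is illustrative):

```python
import json


def async_request_id(response_body: str) -> str:
    """Pull the request ID out of an async_predict response body."""
    return json.loads(response_body)["request_id"]
```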
Filtering logs by request ID
Once you have a request ID, open the model’s logs page and enter it in the search filter bar using the `requestId:` prefix:

```
requestId:31255019cf83c4d0c7492a5006591e1f502a5
```
The view narrows to show only log entries from that request. Each log line also displays the request ID alongside the replica ID, so you can confirm you’re looking at the right trace even when scrolling through mixed output.
Logging with request context
For standard Truss models, Baseten automatically attaches the request ID to any log emitted via Python’s `logging` module during a predict call. No configuration is required; just use a logger:
```python
import logging

logger = logging.getLogger(__name__)


class Model:
    def predict(self, request):
        logger.info("Starting prediction")  # request_id is added automatically
        ...
```
Custom servers
Standard Truss models get request ID logging for free through the framework’s built-in JSON formatter. Custom servers don’t have this support, so you’ll need to do two things: extract the `x-baseten-request-id` header from incoming requests, and include it as a top-level `request_id` key in your JSON log output. Both steps are covered in the setup guides for custom HTTP servers and custom gRPC servers.
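As an illustration of the second step, here is one way to emit JSON logs with a top-level `request_id` key using only the standard library. This is a sketch, not the formatter Baseten ships: the `request_id_var` contextvar and the set of JSON fields are assumptions, and where you read the incoming header depends on your server framework:

```python
import contextvars
import json
import logging

# Holds the incoming x-baseten-request-id value for the current request.
# Hypothetical name -- set it in your server's request handler.
request_id_var = contextvars.ContextVar("request_id", default="")


class JsonFormatter(logging.Formatter):
    """Emit each record as one JSON object with a top-level request_id key."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "levelname": record.levelname,
            "message": record.getMessage(),
            "request_id": request_id_var.get(),
        })
```

In your request handler, call `request_id_var.set(...)` with the value of the incoming `x-baseten-request-id` header before emitting any logs, so every line produced while serving that request carries the ID.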