The Truss server has OpenTelemetry (OTEL) instrumentation built in. Additionally, users can add their own custom instrumentation.
Traces are useful for investigating performance bottlenecks and other issues.
By default, tracing is disabled because it adds minor performance overhead,
which is undesirable in some use cases. Follow the guides below to collect trace data.
Exporting built-in trace data to Honeycomb
To enable data export, create a Honeycomb API key and add it as a secret to
Baseten. Then add the following settings
to the Truss config of the model you want to enable tracing for:
environment_variables:
  HONEYCOMB_DATASET: your_dataset_name
runtime:
  enable_tracing_data: true
secrets:
  HONEYCOMB_API_KEY: '***'
When making requests to the model, you can provide a trace parent ID with the OTEL
standard header key traceparent. If it is not provided, Baseten adds random IDs.
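For example, a request that sets an explicit trace parent might look like the following sketch. The model URL, API key environment variable, and request body are placeholders for your own deployment; the traceparent value follows the W3C Trace Context format.

import os
import secrets

import requests

# W3C Trace Context: version (00) - 16-byte trace ID - 8-byte parent span ID - flags.
trace_id = secrets.token_hex(16)   # 32 hex characters
span_id = secrets.token_hex(8)     # 16 hex characters
traceparent = f"00-{trace_id}-{span_id}-01"

resp = requests.post(
    "https://model-<model_id>.api.baseten.co/production/predict",  # placeholder model URL
    headers={
        "Authorization": f"Api-Key {os.environ['BASETEN_API_KEY']}",  # placeholder API key
        "traceparent": traceparent,
    },
    json={"prompt": "hello"},  # placeholder request body
)
print(resp.json())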
An example trace, visualized in Honeycomb, resolves preprocessing, predict, and
postprocessing. Additionally, these traces contain span events timing the
(de-)serialization of inputs and outputs.
Adding custom OTEL instrumentation
If you want a different resolution of tracing spans and event recording, you can also add
your own OTEL tracing implementation.
Our built-in tracing instrumentation does not mix its trace context with user-defined tracing.
import time
from typing import Any, Generator

import opentelemetry.exporter.otlp.proto.http.trace_exporter as otlp_exporter
import opentelemetry.sdk.resources as resources
import opentelemetry.sdk.trace as sdk_trace
import opentelemetry.sdk.trace.export as trace_export
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider

# Set up a dedicated tracer provider for user-defined spans.
trace.set_tracer_provider(
    TracerProvider(resource=Resource.create({resources.SERVICE_NAME: "UserModel"}))
)
tracer = trace.get_tracer(__name__)
trace_provider = trace.get_tracer_provider()


class Model:
    def __init__(self, **kwargs) -> None:
        # Export user-defined spans directly to Honeycomb.
        honeycomb_api_key = kwargs["secrets"]["HONEYCOMB_API_KEY"]
        honeycomb_exporter = otlp_exporter.OTLPSpanExporter(
            endpoint="https://api.honeycomb.io/v1/traces",
            headers={
                "x-honeycomb-team": honeycomb_api_key,
                "x-honeycomb-dataset": "marius_testing_user",  # Replace with your dataset name.
            },
        )
        honeycomb_processor = trace_export.BatchSpanProcessor(honeycomb_exporter)
        trace_provider.add_span_processor(honeycomb_processor)

    @tracer.start_as_current_span("load_model")
    def load(self):
        ...

    def preprocess(self, model_input):
        with tracer.start_as_current_span("preprocess"):
            ...
            return model_input

    @tracer.start_as_current_span("predict")
    def predict(self, model_input: Any) -> Generator[str, None, None]:
        with tracer.start_as_current_span("start-predict") as span:

            def inner():
                time.sleep(0.01)
                for i in range(5):
                    # Record a span event each time the generator yields a chunk.
                    span.add_event("yield")
                    yield str(i)

            return inner()
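To inspect the custom spans locally before relying on the Honeycomb export, one option (a minimal sketch using the standard OTEL SDK, not a Truss-specific feature) is to attach a console exporter to the same tracer provider so finished spans are printed to stdout:

from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Print finished spans to stdout to verify span names and events during development.
trace_provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))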