By default, Truss wraps prediction results into an HTTP response. For advanced use cases, you can create response objects manually to:

  • Control HTTP status codes.
  • Use server-sent events (SSEs) for streaming responses.

You can return a response from predict or postprocess, but not both.

Returning Custom Response Objects

Any subclass of starlette.responses.Response is supported. This includes fastapi.Response, which is the same class re-exported by FastAPI.

import fastapi

class Model:
    def predict(self, inputs) -> fastapi.Response:
        return fastapi.Response(...)

If predict returns a response, postprocess cannot be used.

Example: Streaming with SSEs

For server-sent events (SSEs), use StreamingResponse:

import time
from starlette.responses import StreamingResponse

class Model:
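    def predict(self, model_input):
        def event_stream():
            # Loops indefinitely, emitting one event per second.
            while True:
                time.sleep(1)
                # Each SSE event is a "data: ..." line terminated by a blank line.
                yield f"data: Server Time: {time.strftime('%Y-%m-%d %H:%M:%S')}\n\n"

        # The text/event-stream media type marks the body as server-sent events.
        return StreamingResponse(event_stream(), media_type="text/event-stream")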
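
On the receiving side, a client has to split the stream back into events. The following is a minimal sketch of that step, assuming the stream arrives as text chunks that may cut an event in half. The helper parse_sse_data is hypothetical (it is not part of Truss or Starlette) and handles only the data: field, not event:, id:, or retry:.

```python
from typing import Iterable, Iterator


def parse_sse_data(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event in a chunked stream."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # A blank line ("\n\n") terminates one event.
        while "\n\n" in buffer:
            raw_event, buffer = buffer.split("\n\n", 1)
            data_lines = []
            for line in raw_event.split("\n"):
                if line.startswith("data:"):
                    value = line[len("data:"):]
                    # A single leading space after the colon is not part of the payload.
                    if value.startswith(" "):
                        value = value[1:]
                    data_lines.append(value)
            if data_lines:
                # Multiple data lines in one event are joined with newlines.
                yield "\n".join(data_lines)


# Usage: the second event is split across two chunks.
chunks = [
    "data: Server Time: 2024-01-01 00:00:00\n\ndata: Ser",
    "ver Time: 2024-01-01 00:00:01\n\n",
]
events = list(parse_sse_data(chunks))
```

Buffering until a blank line is seen lets the parser cope with chunk boundaries that do not line up with event boundaries.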

Limitations

  • Response headers are not fully propagated, so do not rely on custom headers to carry metadata. Include it in the response body instead.

Also see Using Request Objects for handling raw requests.