Returning response objects and SSEs
Get more control by directly creating the response object.
By default, the Truss server wraps the prediction results of your custom model in a response object that is sent back to the client via HTTP.
In advanced use cases, you might want to create these response objects yourself. Example use cases are:
- Control over the HTTP status codes.
- With streaming responses, you can use server-sent events (SSEs).
There is likewise support for using request objects.
import fastapi

class Model:
    def predict(self, inputs) -> fastapi.Response:
        return fastapi.Response(...)
You can return a response from either predict or postprocess, and any subclass of starlette.responses.Response is supported.
If you return a response from predict, you cannot use postprocessing.
SSE / Streaming example
import time

from starlette.responses import StreamingResponse

class Model:
    def predict(self, model_input):
        def event_stream():
            # Emit one SSE event per second; the blank line ("\n\n") ends each event.
            while True:
                time.sleep(1)
                yield ("data: Server Time: "
                       f"{time.strftime('%Y-%m-%d %H:%M:%S')}\n\n")

        return StreamingResponse(event_stream(), media_type="text/event-stream")
Response headers are not fully propagated. Include all information in the response itself.
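On the client side, the stream produced by the SSE example consists of `data:` lines, with a blank line ending each event. The sketch below shows one way to extract the payloads, assuming the client already receives the stream as decoded text lines. The function name `iter_sse_data` is hypothetical, and the parser deliberately ignores other SSE fields such as `event:`, `id:`, and `retry:`.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single space after the colon is not part of the payload.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Any other field or comment line is skipped in this sketch.
    # An event without a terminating blank line is incomplete and is dropped.


if __name__ == "__main__":
    sample = [
        "data: Server Time: 2024-01-01 12:00:00",
        "",
        "data: Server Time: 2024-01-01 12:00:01",
        "",
    ]
    for payload in iter_sse_data(sample):
        print(payload)
```

Since everything the client needs is carried in the `data:` payload, this approach does not rely on response headers.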