Low-level streaming
At a low level, streaming works by sending byte chunks via HTTP (unicode strings are implicitly encoded). The most primitive way of doing this in Chains is to implement run_remote as a bytes or string iterator.
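A minimal sketch of such an iterator, written as a plain async generator so that it runs without the framework installed. The class name Streamer, the parameter n and the chunk contents are made up for illustration; in a real Chain the class would derive from the Chainlet base class, which is left out here.

```python
import asyncio
from typing import AsyncIterator, List


class Streamer:
    """Stand-in for a Chainlet (assumption: a real one subclasses the Chains base class)."""

    async def run_remote(self, n: int) -> AsyncIterator[str]:
        # Each yielded string is one chunk of the streamed response.
        for i in range(n):
            yield f"chunk {i}\n"
            # Hand control back to the event loop between chunks.
            await asyncio.sleep(0)


async def collect(n: int) -> List[str]:
    # Consume the stream chunk by chunk, as a client would.
    return [chunk async for chunk in Streamer().run_remote(n)]


if __name__ == "__main__":
    print(asyncio.run(collect(3)))
```

Yielding bytes instead of str works the same way in this sketch; only the annotation and the chunk values change.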
Server-sent events (SSEs)
A possible choice is to generate chunks that comply with the specification of server-sent events. Concretely, this means sending text chunks with data, event and potentially
other fields (the data field commonly carries a JSON string), served with the content type text/event-stream.
However, the SSE specification is not opinionated about what exactly is
encoded in data or which event types exist. You have to define a schema
that is useful for the client that consumes the data.
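As a rough illustration of what an SSE-compliant chunk looks like, the helper below formats one event. The function name format_sse, the event name "segment" and the payload fields are assumptions of this sketch, not part of Chains or of the SSE specification; encoding the payload as JSON is likewise a convention chosen here.

```python
import json
from typing import Any, Optional


def format_sse(data: Any, event: Optional[str] = None) -> str:
    """Format one server-sent event with a JSON-encoded data field."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps escapes newlines, so the payload fits on a single data line.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    print(format_sse({"text": "hello"}, event="segment"), end="")
```

A streaming run_remote could yield the returned strings one by one; the client then dispatches on the event name and decodes the data field according to the schema you agreed on.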
Pydantic and Chainlet-Chainlet-streams
While the low-level streaming described above is stable, the following helper APIs for typed
streaming are only stable for intra-Chain streaming. If you want to use them for end clients, please reach out to Baseten support,
so we can discuss the stable solutions.
Headers and footers
Typed streaming also helps to solve another challenge of streaming: you might want to send different kinds of data at the beginning or end of a stream than in the “main” part. For example, if you transcribe an audio file, you might want to send many transcription segments in a stream and, at the end, send some aggregate information such as duration, detected languages, etc. We model typed streaming like this:
- [optionally] send a chunk that conforms to the schema of a Header pydantic model.
- Send 0 to N chunks each conforming to the schema of an Item pydantic model.
- [optionally] send a chunk that conforms to the schema of a Footer pydantic model.
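The header / items / footer sequence above can be sketched for the transcription example as follows. To keep the sketch runnable with the standard library only, dataclasses stand in for the pydantic models, and each chunk is a tagged JSON line. All field names, the encode helper and the line-based wire format are assumptions for illustration; they do not describe how Chains serializes chunks.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterator, List


@dataclass
class Header:  # hypothetical: sent once, before any item
    audio_file: str


@dataclass
class Item:  # hypothetical: one transcription segment
    start_sec: float
    text: str


@dataclass
class Footer:  # hypothetical: aggregate information, sent last
    duration_sec: float
    languages: List[str]


def encode(kind: str, payload) -> bytes:
    # One JSON document per line; "kind" tells the consumer which schema applies.
    doc = {"kind": kind, "payload": asdict(payload)}
    return (json.dumps(doc) + "\n").encode("utf-8")


def transcribe_stream() -> Iterator[bytes]:
    yield encode("header", Header(audio_file="talk.wav"))
    for segment in [Item(0.0, "hello"), Item(1.5, "world")]:
        yield encode("item", segment)
    yield encode("footer", Footer(duration_sec=3.0, languages=["en"]))


if __name__ == "__main__":
    for chunk in transcribe_stream():
        print(chunk)
```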
APIs
StreamTypes
To have a single source of truth for the types that can be shared between the producing Chainlet and the consuming client (either a Chainlet in the Chain or an external client), the Chains framework uses a StreamTypes object.
StreamWriter
Use the STREAM_TYPES object to create a matching stream writer.
yield_header and yield_footer methods are available on the writer.
The writer serializes the pydantic data to bytes, so you can also
efficiently represent numeric data (see the
binary IO guide).
StreamReader
To consume the stream in either another Chainlet or an external client, a matching StreamReader is created from your StreamTypes. Besides the
types, you connect the reader to the bytes generator that you obtain from the
remote invocation of the streaming Chainlet.
The reader has read_header and read_footer methods.
Note that the stream can only be consumed once and you have to consume
header, items and footer in order.
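To make the "consume once, in order" rule concrete, here is a toy reader that enforces it. It is synchronous, assumes a newline-delimited JSON format with a "kind" tag, and assumes that header and footer are always present. None of this reflects the actual StreamReader implementation or wire format; the class OrderedReader and the SAMPLE data are invented for this sketch.

```python
import json
from typing import Iterable, Iterator, Optional

SAMPLE = [
    b'{"kind": "header", "payload": {"audio_file": "talk.wav"}}\n',
    b'{"kind": "item", "payload": {"text": "hello"}}\n',
    b'{"kind": "item", "payload": {"text": "world"}}\n',
    b'{"kind": "footer", "payload": {"duration_sec": 3.0}}\n',
]


class OrderedReader:
    """Toy reader: enforces header -> items -> footer and single consumption."""

    def __init__(self, stream: Iterable[bytes]) -> None:
        self._chunks = (json.loads(line) for line in stream)
        self._stage = "header"  # the part the caller may read next
        self._pending: Optional[dict] = None  # chunk that ended the item run

    def _expect(self, stage: str) -> None:
        if self._stage != stage:
            raise RuntimeError(f"cannot read {stage} now, expected {self._stage}")

    def read_header(self) -> dict:
        self._expect("header")
        chunk = next(self._chunks)
        if chunk["kind"] != "header":
            raise ValueError("stream does not start with a header")
        self._stage = "items"
        return chunk["payload"]

    def read_items(self) -> Iterator[dict]:
        self._expect("items")
        for chunk in self._chunks:
            if chunk["kind"] != "item":
                self._pending = chunk  # keep it for read_footer
                break
            yield chunk["payload"]
        self._stage = "footer"

    def read_footer(self) -> dict:
        self._expect("footer")
        if self._pending is None or self._pending["kind"] != "footer":
            raise ValueError("stream ended without a footer")
        self._stage = "done"
        return self._pending["payload"]


if __name__ == "__main__":
    reader = OrderedReader(SAMPLE)
    print(reader.read_header())
    print(list(reader.read_items()))
    print(reader.read_footer())
```

Calling read_footer before the items are exhausted, or calling any method a second time, raises RuntimeError in this sketch, which mirrors the ordering constraint described above.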
The implementation of StreamReader only needs pydantic, no other Chains
dependencies. So you can take that implementation code in isolation and
integrate it in your client code.