Chainlet
A Chainlet is the basic building block of Chains. A Chainlet is a Python class that specifies:
- A set of compute resources.
- A Python environment with software dependencies.
- A typed interface, run_remote(), for other Chainlets to call.
Only the run_remote() method is required, and we can layer in other concepts to create a more capable Chainlet.
Remote configuration
Chainlets are meant for deployment as remote services. Each Chainlet specifies its own requirements for compute hardware (CPU count, GPU type and count, etc.) and software dependencies (Python libraries or system packages). This configuration is built into a Docker image automatically as part of the deployment process. When no configuration is provided, the Chainlet will be deployed on a basic instance with one vCPU, 2GB of RAM, no GPU, and a standard set of Python and system packages. Configuration is set using the remote_config class variable within the Chainlet.
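A configuration along these lines might look like the sketch below. It assumes the SDK is importable as truss_chains and exposes RemoteConfig, Compute, and DockerImage with the field names shown; the class name, resource values, and package lists are placeholders, so check them against the SDK reference before relying on them.

```python
import truss_chains as chains


class MyChainlet(chains.ChainletBase):
    # Hardware and software requirements for this Chainlet.
    # All values below are illustrative placeholders (assumptions).
    remote_config = chains.RemoteConfig(
        compute=chains.Compute(cpu_count=2, memory="4Gi", gpu="T4"),
        docker_image=chains.DockerImage(
            pip_requirements=["numpy"],   # Python libraries
            apt_requirements=["ffmpeg"],  # system packages
        ),
    )

    async def run_remote(self, text: str) -> str:
        return text
```

Omitting remote_config entirely would fall back to the default instance described above.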
Initialization
Chainlets are implemented as classes because we often want to set up expensive static resources once at startup and then reuse them with each invocation of the Chainlet. For example, we only want to initialize an AI model and download its weights once, then reuse it every time we run inference. We do this setup in __init__(), which is run exactly once when the Chainlet is deployed or scaled up.
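The set-up-once, reuse-per-call pattern can be sketched without the framework. The example below uses a plain Python class as a stand-in for a Chainlet; the class name, the _load_weights helper, and the fake weights are hypothetical, and a real Chainlet would load an actual model instead.

```python
import time


class EchoModel:
    """Plain-Python stand-in for a Chainlet (hypothetical, not the Chains SDK)."""

    load_count = 0  # counts how often the expensive setup ran

    def __init__(self) -> None:
        # Expensive one-time setup, e.g. downloading weights and building a model.
        self._weights = self._load_weights()

    @classmethod
    def _load_weights(cls) -> dict[str, float]:
        cls.load_count += 1
        time.sleep(0.01)  # simulate a slow download
        return {"scale": 2.0}

    def run_remote(self, value: float) -> float:
        # Every call reuses the weights prepared in __init__.
        return value * self._weights["scale"]


model = EchoModel()  # setup happens here, once
results = [model.run_remote(x) for x in (1.0, 2.0, 3.0)]
```

Three calls are served by a single load, which is the reason for doing the setup in __init__() rather than inside run_remote().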
Context (access information)
You can add a DeploymentContext object as an optional argument to the __init__() method of a Chainlet. This allows you to use secrets within your Chainlet, such as using an hf_access_token to access a gated model on Hugging Face (note that when using secrets, they also need to be added to the assets).
Depends (call other Chainlets)
The Chains framework uses the chains.depends() function in a Chainlet's __init__() method to track the dependency relationship between different Chainlets within a Chain. This syntax, inspired by dependency injection, is used to translate local Python function calls into calls to the remote Chainlets in production. Once a dependency Chainlet is added with chains.depends(), the run_remote() method of the depending Chainlet can call it. For example, a HelloAll Chainlet can make calls to a SayHello Chainlet.
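The HelloAll and SayHello relationship can be sketched with plain Python stand-ins. This is an analogy rather than the Chains SDK: the dependency is injected by hand through the constructor, whereas in a real Chain the framework would supply it via chains.depends(). The example names passed in are arbitrary.

```python
import asyncio


class SayHello:
    """Stand-in for a dependency Chainlet."""

    async def run_remote(self, name: str) -> str:
        return f"Hello, {name}"


class HelloAll:
    """Stand-in for a Chainlet that depends on SayHello."""

    # In a real Chain the dependency would be declared with chains.depends(),
    # letting the framework swap in a client for the remote Chainlet.
    # Here a local instance is injected by hand (assumption for illustration).
    def __init__(self, say_hello: SayHello) -> None:
        self._say_hello = say_hello

    async def run_remote(self, names: list[str]) -> str:
        greetings = []
        for name in names:
            # Reads like a local call; in production it would be a remote call.
            greetings.append(await self._say_hello.run_remote(name))
        return "\n".join(greetings)


hello_all = HelloAll(say_hello=SayHello())
greeting = asyncio.run(hello_all.run_remote(["Ada", "Grace"]))
```

The calling code never needs to know whether the dependency lives in the same process or behind a network endpoint, which is the point of the injection style.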
Run remote (chaining Chainlets)
The run_remote() method is run each time the Chainlet is called. It is the sole public interface for the Chainlet (though you can have as many private helper functions as you want) and its inputs and outputs must have type annotations. In run_remote() you implement the actual work of the Chainlet, such as model inference or data chunking.
We recommend implementing run_remote() as an async method and using async APIs for all the work (e.g. downloads, vLLM or TRT inference). It is possible to stream results back; see our streaming guide.
If run_remote() makes calls to other Chainlets, e.g. invoking a dependency Chainlet for each element in a list, you can benefit from concurrent execution by making run_remote() an async method and starting the calls as concurrent tasks with asyncio.ensure_future(self._dep_chainlet.run_remote(...)).
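The fan-out pattern described above can be sketched with the standard library alone. SlowDependency and FanOut are hypothetical stand-ins for Chainlets, and the sleep simulates network and inference latency; only the asyncio calls are real APIs.

```python
import asyncio


class SlowDependency:
    """Stand-in for a dependency Chainlet with an async run_remote()."""

    async def run_remote(self, text: str) -> str:
        await asyncio.sleep(0.05)  # simulate network and inference latency
        return text.upper()


class FanOut:
    """Stand-in for a Chainlet that calls its dependency once per list element."""

    def __init__(self, dep_chainlet: SlowDependency) -> None:
        self._dep_chainlet = dep_chainlet

    async def run_remote(self, texts: list[str]) -> list[str]:
        # Start every call as a task first so they all run concurrently ...
        tasks = [
            asyncio.ensure_future(self._dep_chainlet.run_remote(text))
            for text in texts
        ]
        # ... then wait for all of them; gather() preserves the input order.
        return list(await asyncio.gather(*tasks))


outputs = asyncio.run(FanOut(SlowDependency()).run_remote(["a", "b", "c"]))
```

Awaiting each call inside the loop instead would run them one after another, so total latency would grow with the length of the list.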
Entrypoint
The entrypoint is called directly from the deployed Chain’s API endpoint and kicks off the entire chain. The entrypoint is also responsible for returning the final result back to the client. Using the @chains.mark_entrypoint decorator, one Chainlet within a file is set as the entrypoint to the chain.
I/O and pydantic data types
To make orchestrating multiple remotely deployed services possible, Chains
relies heavily on typed inputs and outputs. Values must be serialized to a safe
exchange format to be sent over the network.
The Chains framework uses the type annotations to infer how data should be
serialized and currently is restricted to types that are JSON compatible. Types
can be:
- Direct type annotations for simple types such as int, float, or list[str].
- Pydantic models to define a schema for nested data structures or multiple arguments.
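To make "JSON compatible" concrete, the sketch below round-trips a few values through the standard library json module. It only illustrates the constraint and is not how Chains performs serialization; the payload contents are made up.

```python
import json

# Values matching simple annotations such as int, float, or list[str]
# survive a JSON round trip unchanged.
payload = {"count": 3, "threshold": 0.5, "names": ["a", "b"]}
restored = json.loads(json.dumps(payload))

# A nested structure is exchanged as a JSON object with named fields;
# a Pydantic model would describe the same schema with type annotations.
nested = {"user": {"id": 1, "tags": ["x", "y"]}}
nested_restored = json.loads(json.dumps(nested))

# Types without a JSON representation, such as bytes, cannot be sent as-is.
try:
    json.dumps({"blob": b"\x00\x01"})
    bytes_ok = True
except TypeError:
    bytes_ok = False
```

Data that is not JSON compatible, such as raw bytes, needs an explicit encoding step (for example base64 into a str) before it can cross a Chainlet boundary under this restriction.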
Chains compared to Truss
Tips for Truss users
Chains is an alternate SDK for packaging and deploying AI models. It carries over many features and concepts from Truss and gives you access to the benefits of Baseten (resource provisioning, autoscaling, fast cold starts, etc.), but it is not a 1-1 replacement for Truss. Here are some key differences:
- Rather than running truss init and creating a Truss in a directory, a Chain is a single file, giving you more flexibility for implementing multi-step model inference. Create an example with truss chains init.
- Configuration is done inline in typed Python code rather than in a config.yaml file.
- While Chainlets are converted to Truss models when run on Baseten, Chainlet != TrussModel.