Build your first Chain
Build and deploy two example Chains
This quickstart guide contains instructions for creating two Chains:
- A simple CPU-only "hello world" Chain.
- A Chain that implements Phi-3 Mini and uses it to write poems.
Prerequisites
To use Chains, install a recent Truss version and ensure pydantic is v2:
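For example (exact version pins are up to you; pydantic must be v2):

```sh
pip install --upgrade truss 'pydantic>=2.0'
```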
To deploy Chains remotely, you also need a Baseten account.
It is handy to export your API key to the current shell session, or permanently in your .bashrc:
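For example (BASETEN_API_KEY is the variable name assumed by the invocation examples below):

```sh
export BASETEN_API_KEY="YOUR_API_KEY"
```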
Example: Hello World
Chains are written in Python files. In your working directory, create hello_chain/hello.py:
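For example:

```sh
mkdir hello_chain
cd hello_chain
touch hello.py
```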
In the file, we'll specify a basic Chain. It has two Chainlets:
- HelloWorld, the entrypoint, which handles the input and output.
- RandInt, which generates a random integer. It is used as a dependency by HelloWorld.
Via the entrypoint, the Chain takes a maximum value and returns the string "Hello World!" repeated a variable number of times.
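A sketch of hello.py consistent with this description (ChainletBase, depends, and mark_entrypoint are the truss_chains APIs; the exact implementation details are illustrative):

```python
import random

import truss_chains as chains


class RandInt(chains.ChainletBase):
    def run_remote(self, max_value: int) -> int:
        # Return a random integer in [1, max_value].
        return random.randint(1, max_value)


@chains.mark_entrypoint
class HelloWorld(chains.ChainletBase):
    def __init__(self, rand_int: RandInt = chains.depends(RandInt)) -> None:
        # Dependencies are declared as __init__ arguments via chains.depends().
        self._rand_int = rand_int

    def run_remote(self, max_value: int) -> str:
        num_repetitions = self._rand_int.run_remote(max_value)
        return "Hello World! " * num_repetitions
```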
The Chainlet class-contract
Exactly one Chainlet must be marked as the entrypoint with the @chains.mark_entrypoint decorator. This Chainlet is responsible for handling public-facing input and output for the whole Chain in response to an API call.
A Chainlet class has a single public method, run_remote(), which is the API endpoint for the entrypoint Chainlet and the function that other Chainlets can use as a dependency. The run_remote() method must be fully type-annotated with primitive Python types or pydantic models.
Chainlets cannot be instantiated directly. The only correct usages are:
- Making one Chainlet depend on another via the chains.depends() directive as an __init__ argument, as shown above for the RandInt Chainlet.
- In local debugging mode.
Beyond that, you can structure your code as you like, with private methods, imports from other files, and so forth.
Keep in mind that Chainlets are intended for distributed, replicated, remote execution. Avoid global variables, global state, and Python features like dynamically importing modules at runtime; they may not work as intended.
Deploy your Chain to Baseten
To deploy your Chain to Baseten, run:
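At the time of writing, the deploy command looks like this (it may differ across Truss versions):

```sh
truss chains push hello.py
```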
The deploy command prints status information, including the Chain's invocation URL.
Wait for the status to turn to ACTIVE and test invoking your Chain (replace $INVOCATION_URL in the command below):
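A minimal invocation sketch with curl; the JSON field name matches the run_remote argument, and Api-Key is Baseten's authorization scheme:

```sh
curl -X POST $INVOCATION_URL \
  -H "Authorization: Api-Key $BASETEN_API_KEY" \
  -d '{"max_value": 3}'
```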
Example: Poetry with LLMs
Our second example also has two Chainlets, but is somewhat more complex and realistic. The Chainlets are:
- PoemGenerator, the entrypoint, which handles the input and output and orchestrates calls to the LLM.
- PhiLLM, which runs inference on Phi-3 Mini.
This Chain takes a list of words and returns a poem about each word, written by Phi-3. Architecturally, the entrypoint PoemGenerator fans out one concurrent request per word to the PhiLLM Chainlet and collects the results.
We build this Chain in a new working directory (if you are still inside hello_chain/, go up one level with cd .. first):
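For example (the directory name is illustrative):

```sh
mkdir poems
cd poems
touch poems.py
```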
A similar end-to-end code example, using Mistral as an LLM, is available in the examples repo.
Building the LLM Chainlet
The main difference between this Chain and the previous one is that we now have an LLM that needs a GPU and more complex dependencies.
Copy the following code into poems.py:
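A sketch of the LLM Chainlet, assuming the Hugging Face model microsoft/Phi-3-mini-4k-instruct and Truss's RemoteConfig, DockerImage, and Compute APIs for declaring pip requirements and the GPU; version pins, GPU type, and generation settings are illustrative:

```python
import truss_chains as chains

PHI_HF_MODEL = "microsoft/Phi-3-mini-4k-instruct"


class PhiLLM(chains.ChainletBase):
    # Declare the image and hardware this Chainlet needs when deployed.
    remote_config = chains.RemoteConfig(
        docker_image=chains.DockerImage(
            pip_requirements=["torch", "transformers", "accelerate"]
        ),
        compute=chains.Compute(gpu="T4"),
    )

    def __init__(self) -> None:
        # Import heavy dependencies lazily, so other Chainlets and local
        # tooling don't need them installed.
        import torch
        import transformers

        self._tokenizer = transformers.AutoTokenizer.from_pretrained(PHI_HF_MODEL)
        self._model = transformers.AutoModelForCausalLM.from_pretrained(
            PHI_HF_MODEL, torch_dtype=torch.float16, device_map="auto"
        )

    async def run_remote(self, prompt: str) -> str:
        import torch

        inputs = self._tokenizer(prompt, return_tensors="pt").to(self._model.device)
        with torch.no_grad():
            output_ids = self._model.generate(**inputs, max_new_tokens=300)
        # Return only the newly generated tokens, not the echoed prompt.
        new_tokens = output_ids[0][inputs["input_ids"].shape[1] :]
        return self._tokenizer.decode(new_tokens, skip_special_tokens=True)
```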
Building the entrypoint
Now that we have an LLM, we can use it in a poem generator Chainlet. Add the following code to poems.py:
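A sketch of the entrypoint; the prompt wording and the words argument name are illustrative:

```python
import asyncio

import truss_chains as chains


@chains.mark_entrypoint
class PoemGenerator(chains.ChainletBase):
    def __init__(self, phi_llm: PhiLLM = chains.depends(PhiLLM)) -> None:
        self._phi_llm = phi_llm

    async def run_remote(self, words: list[str]) -> list[str]:
        # Start one LLM call per word. ensure_future schedules the RPCs
        # concurrently instead of awaiting each one in turn.
        tasks = [
            asyncio.ensure_future(
                self._phi_llm.run_remote(f"Write a poem about: {word}")
            )
            for word in words
        ]
        # gather awaits all pending calls and returns plain Python strings.
        return list(await asyncio.gather(*tasks))
```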
Note that we wrap each RPC to the LLM Chainlet in asyncio.ensure_future. This makes the current Python process start the remote calls concurrently, i.e. each call is dispatched before the previous one has finished, minimizing the overall runtime. To await the results of all calls, we use asyncio.gather, which gives us back plain Python objects.
If the LLM is hit with many concurrent requests, it can auto-scale up (if autoscaling is configured). More advanced LLM models have batching capabilities, so for those even a single instance can serve concurrent requests.
Deploy your Chain to Baseten
To deploy your Chain to Baseten, run:
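As before (assuming the file is named poems.py):

```sh
truss chains push poems.py
```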
Wait for the status to turn to ACTIVE and test invoking your Chain (replace $INVOCATION_URL in the command below):
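An invocation sketch analogous to the first example; the words field matches the entrypoint's run_remote argument:

```sh
curl -X POST $INVOCATION_URL \
  -H "Authorization: Api-Key $BASETEN_API_KEY" \
  -d '{"words": ["bird", "plane", "superman"]}'
```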