Local Development
Iterating, Debugging, Testing, Mocking
Chains are designed for production in replicated remote deployments. But alongside that production-ready power, we offer great local development and deployment experiences.
Locally, a Chain is just Python files in a source tree. While that gives you a lot of flexibility in how you structure your code, there are some constraints and rules to follow to ensure successful distributed, remote execution in production.
The best thing you can do while developing locally with Chains is to run your code frequently, even if you do not have a __main__ section: the Chains framework runs various validations when your Chainlet classes are defined, helping you catch issues early.
Additionally, running mypy and fixing reported type errors can help you find problems early, in a rapid feedback loop, before attempting a (much slower) deployment.
Complementary to purely local development, Chains also has a “watch” mode, like Truss; see the watch guide.
Test a Chain locally
Let’s revisit our “Hello World” Chain:
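As a minimal sketch, loosely following the getting started guide (class names and the greeting text are illustrative):

```python
import truss_chains as chains


# Does the "work": greets a single person.
class SayHello(chains.ChainletBase):
    async def run_remote(self, name: str) -> str:
        return f"Hello, {name}!"


# Orchestrates the work: fans out to the SayHello Chainlet.
@chains.mark_entrypoint
class HelloAll(chains.ChainletBase):
    def __init__(self, say_hello: SayHello = chains.depends(SayHello)) -> None:
        self._say_hello = say_hello

    async def run_remote(self, names: list[str]) -> str:
        greetings = [await self._say_hello.run_remote(name) for name in names]
        return "\n".join(greetings)
```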
When the file is executed as the __main__ module, local instances of the Chainlets are created, allowing you to test the functionality of your Chain just by running the Python file:
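For example, a __main__ section for the sketch above might look like this; chains.run_local() is what creates the local instances:

```python
import asyncio

if __name__ == "__main__":
    with chains.run_local():
        hello_chain = HelloAll()
        result = asyncio.run(hello_chain.run_remote(["Marius", "Sid", "Bola"]))
        print(result)
```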
Mock execution of GPU Chainlets
Using run_local() to run your code locally requires that your development environment have the compute resources and dependencies that each Chainlet needs. But that often isn’t possible when building with AI models.
Chains offers a workaround, mocking, to let you test the coordination and business logic of your multi-step inference pipeline without worrying about running the model locally.
The second example in the getting started guide implements a Truss Chain for generating poems with Phi-3.
This Chain has two Chainlets:
- The PhiLLM Chainlet, which requires an NVIDIA A10G GPU.
- The PoemGenerator Chainlet, which easily runs on a CPU.
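Condensed to its structure, such a Chain might look roughly like this (model loading, prompting, and resource configuration are simplified or elided):

```python
import truss_chains as chains


class PhiLLM(chains.ChainletBase):
    # Requires a GPU; the real Chainlet also loads Phi-3 weights, etc.
    remote_config = chains.RemoteConfig(compute=chains.Compute(gpu="A10G"))

    async def run_remote(self, prompt: str) -> str:
        # Real implementation: run Phi-3 inference on the GPU (elided here).
        raise NotImplementedError


@chains.mark_entrypoint
class PoemGenerator(chains.ChainletBase):
    def __init__(self, phi_llm: PhiLLM = chains.depends(PhiLLM)) -> None:
        self._phi_llm = phi_llm

    async def run_remote(self, words: list[str]) -> list[str]:
        # Ask the LLM for one short poem per word.
        return [
            await self._phi_llm.run_remote(f"Write a poem about {word}.")
            for word in words
        ]
```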
If you have an NVIDIA A10G under your desk, good for you. For the rest of us, we can mock the PhiLLM Chainlet, which is infeasible to run locally, so that we can quickly test the PoemGenerator Chainlet.
To do this, we define a mock Phi-3 model in our __main__ module and give it a run_remote() method that produces a test output matching the output type we expect from the real Chainlet. Then, we inject an instance of this mock Chainlet into our Chain:
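A sketch of that __main__ section, assuming the Chain structure shown above:

```python
import asyncio

if __name__ == "__main__":

    class FakePhiLLM:
        # Same signature and return type as PhiLLM.run_remote, but no GPU needed.
        async def run_remote(self, prompt: str) -> str:
            return "A fake poem."

    with chains.run_local():
        poem_generator = PoemGenerator(phi_llm=FakePhiLLM())
        result = asyncio.run(
            poem_generator.run_remote(words=["bird", "plane", "superman"])
        )
        print(result)
```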
Then run your Python file to execute the mocked Chain locally.
Typing of mocks
You may notice that the argument phi_llm expects a type PhiLLM, while we pass an instance of FakePhiLLM. These aren’t the same, which is formally a type error.

However, this works at runtime because we constructed FakePhiLLM to implement the same protocol as the real thing. We can make this explicit by defining a Protocol as a type annotation:
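For example (the method signature should mirror whatever the real PhiLLM Chainlet exposes):

```python
from typing import Protocol


class PhiProtocol(Protocol):
    async def run_remote(self, prompt: str) -> str:
        ...
```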
and changing the argument type in PoemGenerator:
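That is, the dependency is annotated with the Protocol rather than the concrete Chainlet (sketch):

```python
@chains.mark_entrypoint
class PoemGenerator(chains.ChainletBase):
    def __init__(self, phi_llm: PhiProtocol = chains.depends(PhiLLM)) -> None:
        self._phi_llm = phi_llm
```

Both the real PhiLLM and FakePhiLLM satisfy PhiProtocol, so mypy accepts either.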
This is a bit more work and isn’t needed to execute the code, but it shows how typing consistency can be achieved, if desired.