Baseten provides two core development workflows: developing a model with Truss and orchestrating models with Chains. Both are building blocks for production-grade ML systems, but they solve different problems.

Truss vs. Chains: When to use each

Developing a model with Truss

Truss is the open-source package you use to turn any ML model into a production-ready API on Baseten.

Use Truss when:

  • You’re deploying a single model. Whether it’s a fine-tuned transformer or a scikit-learn classifier, Truss wraps your model with code, configuration, and environment so it can be deployed and served at scale.

  • You need control over how your model runs. Truss supports custom Python packages, system dependencies, GPU settings, and custom inference logic. You can define pre- and post-processing, batching, and logging — all versioned and deployable.

  • You want to keep development local and reproducible. Truss makes it easy to develop locally in a containerized environment, then push to Baseten with confidence that the same code will run in production.
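To make this concrete, here is a minimal sketch of the kind of `model.py` a Truss typically wraps: a `Model` class with `load` and `predict` methods. The "model" itself is a hypothetical stand-in here; consult the Truss docs for the exact interface your version expects.

```python
# model/model.py -- minimal Truss-style model sketch.
# The lambda "model" is a hypothetical stand-in for real weights.
class Model:
    def __init__(self, **kwargs):
        # Truss passes deployment context (config, secrets) via kwargs;
        # a real model would read from it here.
        self._model = None

    def load(self):
        # Called once at startup: load weights, tokenizers, or pipelines.
        # Trivial stand-in for illustration:
        self._model = lambda text: text.upper()

    def predict(self, model_input: dict) -> dict:
        # Custom inference logic, including any pre/post-processing.
        result = self._model(model_input["text"])
        return {"output": result}
```

In a typical workflow you would scaffold this layout with `truss init` and deploy it with `truss push`, letting the same containerized code run locally and in production.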

Orchestrating with Chains

Chains are for building inference workflows that span multiple steps, models, or tools. You define a sequence of steps — such as routing requests, transforming data, or feeding one model's output into another — and deploy and run them as a single unit.

Use Chains when:

  • You’re combining multiple models or tools. For example, running a vector search + LLM pipeline, or combining OCR, classification, and validation steps.

  • You want visibility into intermediate steps. Chains let you debug and monitor each part of the workflow, retry failed steps, and trace outputs with ease — something that’s much harder with a single model endpoint.

  • You’re using orchestration libraries like LangChain or LlamaIndex. Chains integrate natively with these frameworks, while still allowing you to insert your own logic or wrap Truss models as steps.
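To illustrate the multi-step idea, here is the shape of the OCR, classification, and validation pipeline from above, sketched in plain Python with hypothetical stub steps. This shows the orchestration pattern only, not the Chains API itself; in Chains, each step would be its own deployable unit that can be scaled, retried, and traced independently.

```python
# Plain-Python sketch of a multi-step inference workflow.
# All three step functions are hypothetical stand-ins for real models.
def ocr(image_bytes: bytes) -> str:
    # Stand-in for a real OCR model call.
    return image_bytes.decode("utf-8")

def classify(text: str) -> str:
    # Stand-in for a document-classification model.
    return "invoice" if "total" in text.lower() else "other"

def validate(label: str, text: str) -> dict:
    # Stand-in validation step; intermediate values stay inspectable.
    return {"label": label, "valid": label != "other", "chars": len(text)}

def pipeline(image_bytes: bytes) -> dict:
    # The orchestration layer: each intermediate output can be logged,
    # retried, or traced -- the per-step visibility Chains provides.
    text = ocr(image_bytes)
    label = classify(text)
    return validate(label, text)
```

In a real Chain, each function above would become a separate step, and a Truss model could be wrapped as any one of them.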