Baseten makes it easy to go from a trained machine learning model to a fully deployed, production-ready API. You’ll use Truss, our open-source model packaging tool, to containerize your model code and configuration, and ship it to Baseten for deployment, testing, and scaling.

What does it mean to develop a model?

In Baseten, developing a model means:
  1. Packaging your model code and weights: Wrap your trained model into a structured project that includes your inference logic and dependencies.
  2. Configuring the model environment: Define everything needed to run your model, including Python packages, system dependencies, and secrets.
  3. Deploying and iterating quickly: Push your model to Baseten and iterate with live edits using truss push --watch.
Once your model works the way you want, you can promote it to production, ready for live traffic.

Development flow on Baseten

Here’s what the typical model development loop looks like:
  1. Initialize a new model project with truss init.
  2. Add your model logic to a Python class (model.py), specifying how to load and run inference.
  3. Configure dependencies in a YAML or Python config.
  4. Deploy the model with truss push for a published deployment, or truss push --watch for a development deployment with live-reloading.
  5. Iterate and test using truss watch to sync changes to your dev deployment as you tune the model.
  6. Promote to production with truss push when you’re ready to scale.
When you push a model with Truss, it runs in a standardized container on Baseten without needing Docker installed locally. Truss also gives you a fast developer loop and a consistent way to configure and serve models.
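For orientation, step 1 above (truss init) scaffolds a small project. The layout typically looks something like this (exact files may vary by Truss version):

```text
my-model/
├── config.yaml      # runtime environment and deployment settings
└── model/
    └── model.py     # Model class with load() and predict()
```

You edit model.py and config.yaml in place, and truss watch syncs those edits to your development deployment.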

What is Truss?

Truss is the tool you use to:
  • Scaffold a new model project
  • Serve models locally or in the cloud
  • Package your code, config, and model files
  • Push to Baseten for deployment
You can think of it as the developer toolkit for building and managing model servers, built specifically for machine learning workflows. With Truss, you can create a containerized model server and define everything about how your model runs: Python and system packages, GPU settings, environment variables, and custom inference logic.

Truss gives you a fast, reproducible dev loop where you test changes in a remote environment that mirrors production. It is flexible enough to support a wide range of ML stacks, including:
  • Model frameworks like PyTorch, transformers, and diffusers
  • Inference engines like TensorRT-LLM, SGLang, vLLM
  • Serving technologies like Triton
  • Any package installable with pip or apt
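As a sketch, a config.yaml for a transformers-based model might declare its dependencies and resources like this (the model name, package versions, and accelerator choice are illustrative, not prescriptive):

```yaml
model_name: my-text-classifier    # illustrative name
python_version: py311
requirements:                     # installed with pip
  - torch==2.3.0
  - transformers==4.41.0
system_packages:                  # installed with apt
  - ffmpeg
resources:
  accelerator: A10G               # GPU settings
  use_gpu: true
secrets:
  hf_access_token: null           # resolved from Baseten-stored secrets at runtime
```

Anything you would normally install or configure by hand lives in this file, which is what makes the resulting container reproducible.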
You’ll use Truss throughout this guide, but the focus stays on how you develop models, not just how Truss works.

From model to server: the key components

When you develop a model on Baseten, you define:
  • A Model class: Defines how your model is loaded (in load) and how inference runs (in predict).
  • A configuration file (config.yaml or Python config): Defines the runtime environment, dependencies, and deployment settings.
  • Optional extra assets, like model weights, secrets, or external packages.
Together, these components form a Truss: a portable, reproducible package that you deploy, version, and scale on Baseten.
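To make this concrete, here is a minimal sketch of a model/model.py. The lambda standing in for a real model is there only to keep the example self-contained; in practice load() would pull in weights from a framework like transformers:

```python
class Model:
    def __init__(self, **kwargs):
        # Truss passes runtime context (such as secrets) via kwargs.
        self._secrets = kwargs.get("secrets", {})
        self._model = None

    def load(self):
        # Runs once when the model server starts: load weights here.
        # Stand-in for loading a real model; returns the input's length.
        self._model = lambda text: {"length": len(text)}

    def predict(self, model_input):
        # Runs on every request; model_input is the parsed request body.
        text = model_input["text"]
        return self._model(text)
```

Baseten calls load() once at startup and predict() on each request, so you can exercise the same class directly in a local Python session while iterating.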

Development vs. published deployments

By default, truss push creates a published deployment.
  • Published deployment (truss push): Stable, autoscaled, and ready for live traffic. Does not support live-reloading.
  • Development deployment (truss push --watch): Meant for iteration and testing. Supports live-reloading for quick feedback loops and scales to a single replica.
Use development mode to build and test, then deploy a published version with truss push when you’re satisfied.