Developing a Model on Baseten
This page introduces the key concepts and workflow you’ll use to package, configure, and iterate on models using Baseten’s developer tooling.
Baseten makes it easy to go from a trained machine learning model to a fully deployed, production-ready API. You’ll use Truss—our open-source model packaging tool—to containerize your model code and configuration, and ship it to Baseten for deployment, testing, and scaling.
What does it mean to develop a model?
In Baseten, developing a model means:
- Packaging your model code and weights: Wrap your trained model into a structured project that includes your inference logic and dependencies.
- Configuring the model environment: Define everything needed to run your model, from Python packages to system dependencies and secrets.
- Deploying and iterating quickly: Push your model to Baseten in development mode and make live edits with instant feedback.
Once your model works the way you want, you can promote it to production, ready for live traffic.
Development flow on Baseten
Here’s what the typical model development loop looks like:
1. Initialize a new model project using the Truss CLI.
2. Add your model logic to a Python class (`model.py`), specifying how to load the model and run inference.
3. Configure dependencies in a YAML or Python config.
4. Deploy the model in development mode using `truss push`.
5. Iterate fast with `truss watch`, which live-reloads your dev deployment as you make changes.
6. Test and tune the model until it’s production-ready.
7. Promote the model to production when you’re ready to scale.
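The loop above can be sketched with the Truss CLI. This is a hedged outline, not a full walkthrough: the project name is illustrative, and you should confirm flags like `--publish` against `truss push --help` for your installed version.

```shell
# 1. Scaffold a new Truss project (creates model/model.py and config.yaml)
truss init my-model
cd my-model

# 2–3. Edit model/model.py and config.yaml, then deploy a
# development deployment to Baseten
truss push

# 4–5. Live-reload the dev deployment as you edit code or config
truss watch

# 6–7. Once the model behaves as expected, push a stable,
# published deployment ready for live traffic
truss push --publish
```

Each `truss push` requires a Baseten API key, so these commands run against your Baseten account rather than locally.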
Note: Truss runs your model in a standardized container without needing Docker installed locally. It also gives you a fast developer loop and a consistent way to configure and serve models.
What is Truss?
Truss is the tool you use to:
- Scaffold a new model project
- Serve models locally or in the cloud
- Package your code, config, and model files
- Push to Baseten for deployment
You can think of it as the developer toolkit for building and managing model servers—built specifically for machine learning workflows.
We’ll use Truss throughout this guide, but the focus will stay on how you develop models, not just how Truss works.
From model to server: the key components
When you develop a model on Baseten, you define:
-
A
Model
class: This is where your model is loaded, preprocessed, run, and the results returned. -
A configuration file (
config.yaml
or Python config): Defines the runtime environment, dependencies, and deployment settings. -
Optional extra assets, like model weights, secrets, or external packages.
These components together form a Truss, which is what you deploy to Baseten.
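As a sketch, a minimal `Model` class might look like the following. The doubling “model” is a stand-in for real weights and inference code; Truss calls `load()` once at startup and `predict()` for each request.

```python
# model/model.py
class Model:
    """Minimal Truss model class: load() runs once at startup,
    predict() runs for each incoming request."""

    def __init__(self, **kwargs):
        # kwargs can carry config values and secrets; unused here
        self._model = None

    def load(self):
        # Load weights or initialize your model here.
        # A trivial stand-in "model" that doubles its input:
        self._model = lambda x: x * 2

    def predict(self, model_input):
        # model_input is the parsed JSON request body
        value = model_input["value"]
        return {"output": self._model(value)}
```

In a real Truss, `load()` would typically read weights from disk or a model hub, and `predict()` would run framework inference instead of a lambda.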
Truss simplifies and standardizes model packaging for seamless deployment. It encapsulates model code, dependencies, and configurations into a portable, reproducible structure, enabling efficient development, scaling, and optimization.
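A matching `config.yaml` could be as small as the sketch below. The field names follow the Truss config schema, but the model name, pinned package, and resource sizes are illustrative assumptions, not recommendations.

```yaml
model_name: my-model
python_version: py311
requirements:
  - torch==2.2.0
resources:
  cpu: "1"
  memory: 2Gi
  use_gpu: false
```

Anything your `model.py` imports must appear in `requirements`, since the deployed container installs only what the config declares.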
Development vs. other deployments
The only special deployment is development.
- Development deployment: meant for iteration and testing. It supports live-reloading for quick feedback loops and runs a single replica with no autoscaling.
- All other deployments: stable, autoscaled, and ready for live traffic, but without live-reloading.
You’ll use the dev deployment to build and test, then promote it to an environment like staging or production once you’re satisfied.