What does it mean to develop a model?
In Baseten, developing a model means:
- Packaging your model code and weights: Wrap your trained model into a structured project that includes your inference logic and dependencies.
- Configuring the model environment: Define everything needed to run your model—from Python packages to system dependencies and secrets.
- Deploying and iterating quickly: Push your model to Baseten in development mode and make live edits with instant feedback.
Development flow on Baseten
Here’s what the typical model development loop looks like:
- Initialize a new model project using the Truss CLI.
- Add your model logic to a Python class (model.py), specifying how to load and run inference.
- Configure dependencies in a YAML or Python config.
- Deploy the model in development mode using truss push.
- Iterate fast with truss watch—live-reload your dev deployment as you make changes.
- Test and tune the model until it’s production-ready.
- Promote the model to production when you’re ready to scale.
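As a rough illustration of the model logic step, here is a minimal sketch of what model.py might contain. It assumes the conventional Truss layout of a Model class with load and predict methods; the uppercase "model" is a hypothetical placeholder standing in for real weights, and the input key "text" is chosen only for this example.

```python
# model.py: a minimal sketch, not a definitive implementation.
class Model:
    def __init__(self, **kwargs):
        # Keep construction cheap; heavy setup belongs in load().
        self._model = None

    def load(self):
        # Runs once when the deployment starts. A real project would
        # load trained weights here; this lambda is a hypothetical stand-in.
        self._model = lambda text: text.upper()

    def predict(self, model_input):
        # Runs per request; "text" is an assumed input key for this sketch.
        return {"output": self._model(model_input["text"])}


if __name__ == "__main__":
    # Exercise the class locally, the same way a server would: load, then predict.
    model = Model()
    model.load()
    print(model.predict({"text": "hello"}))
```

Keeping the expensive work in load rather than in the constructor or in predict means it happens once per replica instead of once per request.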

Note: Truss runs your model in a standardized container without needing
Docker installed locally. It also gives you a fast developer loop and a
consistent way to configure and serve models.
What is Truss?
Truss is the tool you use to:
- Scaffold a new model project
- Serve models locally or in the cloud
- Package your code, config, and model files
- Push to Baseten for deployment
Truss works with:
- Model frameworks like PyTorch, transformers, and diffusers
- Inference engines like TensorRT-LLM, SGLang, and vLLM
- Serving technologies like Triton
- Any package installable with pip or apt
From model to server: the key components
When you develop a model on Baseten, you define:
- A Model class: This is where your model is loaded, inputs are preprocessed, inference is run, and results are returned.
- A configuration file (config.yaml or Python config): Defines the runtime environment, dependencies, and deployment settings.
- Optional extra assets, like model weights, secrets, or external packages.
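To make the stages of the Model class concrete, the sketch below shows one way the load, preprocess, predict, and postprocess steps could fit together. The doubling "weights", the "values" input key, and the handle_request helper are all hypothetical and exist only for illustration; the helper is not part of Truss and simply mimics the order in which a server would be expected to call the hooks. Reading secrets from the constructor's keyword arguments is likewise an assumption.

```python
class Model:
    def __init__(self, **kwargs):
        # Assumption: the server passes settings such as secrets as keyword arguments.
        self._secrets = kwargs.get("secrets", {})
        self._weights = None

    def load(self):
        # Stand-in for reading real weights from the project's assets.
        self._weights = {"scale": 2.0}

    def preprocess(self, model_input):
        # Normalize the raw request body into a list of floats.
        return [float(x) for x in model_input["values"]]

    def predict(self, model_input):
        # Run "inference": here, just multiply by the loaded scale.
        return [x * self._weights["scale"] for x in model_input]

    def postprocess(self, model_output):
        # Shape the result into the response body.
        return {"predictions": model_output}


def handle_request(model, body):
    # Hypothetical helper (not a Truss API): chains the hooks in order.
    return model.postprocess(model.predict(model.preprocess(body)))
```

Splitting input cleanup and output shaping from the inference call keeps predict focused on the model itself, which makes each stage easier to test on its own.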
Development vs. other deployments
The only special deployment is development.
- Development deployment: Meant for iteration and testing. It supports live-reloading for quick feedback loops and scales to at most one replica, with no autoscaling.
- All other deployments: Stable, autoscaled, and ready for live traffic, but they don’t support live-reloading.