Development deployments let you iterate on your model without redeploying from scratch each time you make a change. When you save a file, Truss detects the change, calculates a patch, and applies it to the running deployment in seconds.
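To make the detect-and-patch idea concrete, here is a minimal conceptual sketch of one common way to detect file changes: hash every file, then compare two snapshots. It is an illustration only and is not Truss's actual implementation. The function names `snapshot` and `diff_snapshots` are hypothetical.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to a SHA-256 digest of its contents."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths as added, removed, or modified between two snapshots."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

A patch in this sketch is just the set of added, removed, and modified paths. A real patching system would also need to ship the new file contents and apply them inside the running container.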

Start a development deployment

Create a development deployment and start watching for changes:
truss push --watch
Truss creates a development deployment, waits for it to build, and begins watching your project directory for file changes. Once the deployment reaches the LOADING_MODEL stage, Truss enters watch mode early so you can start iterating while the model finishes loading.
🪵  View logs for your deployment at https://app.baseten.co/models/abc1d2ef/logs/xyz123
👀 Watching for changes to truss...

Re-attach to a development deployment

If you stop the watch session (Ctrl+C), re-attach to the existing development deployment with:
truss watch
You should see:
🪵  View logs for your development model at https://app.baseten.co/models/abc1d2ef/logs/xyz123
🚰 Attempting to sync truss with remote
No changes observed, skipping patching.
👀 Watching for new changes.
truss watch syncs any changes made while disconnected, then resumes watching. It requires an existing development deployment. If you don’t have one, use truss push --watch to create it.

What gets live-patched

Truss monitors your project directory (respecting .trussignore patterns) and applies patches for the following changes without a full rebuild:
Change type | Examples
Model code | Files in the model/ directory: model.py, helper modules, utilities.
Bundled packages | Files in the packages/ directory.
Python requirements | Adding, removing, or updating packages in requirements or a requirements file.
Environment variables | Adding, removing, or updating values in environment_variables.
External data | Adding or removing entries in external_data.
Config values | Most config.yaml changes (except those listed below).
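As an illustration of the "Model code" case, the sketch below shows a minimal model/model.py of the kind you might edit during a watch session. The class layout (`__init__`, `load`, `predict`) follows the usual Truss model interface. The uppercase "model" and the `text` input key are placeholders assumed for this example, not part of any real deployment.

```python
class Model:
    def __init__(self, **kwargs):
        # Truss passes configuration through kwargs; this sketch ignores it.
        self._model = None

    def load(self):
        # Placeholder for loading real weights: a function that uppercases text.
        self._model = lambda text: text.upper()

    def predict(self, model_input):
        # "text" is a hypothetical input key chosen for this example.
        return {"output": self._model(model_input["text"])}
```

Saving an edit to a file like this, for example changing the body of `predict`, is the kind of change the table lists as patchable without a full rebuild.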

What requires a full redeploy

The patch system doesn’t support some changes. When you make one of these changes, stop the watch session and run truss push (or truss push --watch to start a new development deployment):
Change type | Why
resources (GPU type, count) | Requires a new instance.
python_version | Requires a new base image.
system_packages | Requires apt installation in the container.
live_reload | Changes the deployment mode.
Data directory (data/) | The patch system doesn’t track file changes in data/.
If a patch fails, Truss prints an error and continues watching. Fix the issue in your source files and save again. For persistent failures, run truss push --watch to start fresh.
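The config keys listed in this section can be checked mechanically before you save. The following is a minimal sketch of a hypothetical helper, not part of the Truss CLI or library, that compares two already-parsed config.yaml dictionaries and reports which redeploy-only keys differ. It does not cover changes under data/, which are files rather than config keys.

```python
# Config keys that this section lists as requiring a full redeploy when changed.
REDEPLOY_KEYS = ("resources", "python_version", "system_packages", "live_reload")


def keys_requiring_redeploy(old_config: dict, new_config: dict) -> list[str]:
    """Return the redeploy-only keys whose values differ between two configs."""
    return [
        key
        for key in REDEPLOY_KEYS
        if old_config.get(key) != new_config.get(key)
    ]


# Example values are placeholders chosen for illustration.
old = {"python_version": "py311", "environment_variables": {"MODE": "dev"}}
new = {"python_version": "py311", "environment_variables": {"MODE": "test"}}
print(keys_requiring_redeploy(old, new))  # an empty list: only a patchable key changed
```

An empty result means that, according to the lists above, the config change should be patchable. A non-empty result names the keys that call for stopping the watch session and pushing again.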

Limitations

Development deployments optimize for iteration, not production traffic:
  • Single replica: Fixed at 0 minimum, 1 maximum. No autoscaling beyond one replica.
  • No gRPC: Trusses with gRPC transport require a published deployment.
  • No TRT-LLM engine builds: The TRT-LLM build flow requires a published deployment.
See Development deployments for the full autoscaling constraints.

Deploy to production

When you’re done iterating, deploy a published version:
truss push
By default, truss push creates a published deployment with full autoscaling support. Published deployments can scale to multiple replicas and are suitable for production traffic. To deploy and promote directly to the production environment:
truss push --promote