Deployments can be promoted to an environment (for example, “staging”) to validate outputs before moving to production, allowing for safer model iteration and evaluation.
Deployment management
Environments support structured validation before promoting a deployment, including:
- Automated tests and evaluations.
- Manual testing in pre-production.
- Gradual traffic shifts with canary deployments.
- Shadow serving for real-world analysis.
Each environment provides:
- Dedicated API endpoint: See the Predict API reference.
- Autoscaling controls: Scale behavior is managed per environment.
- Traffic ramp-up: Supports canary rollouts and rolling deployments.
- Monitoring and metrics: Export environment metrics.
The production environment has the following restrictions:
- It can’t be deleted unless the entire model is removed.
- You can’t create additional environments named “production.”
Custom environments
In addition to the standard production environment, you can create as many custom environments as needed:
- In the model management page on the Baseten dashboard.
- Via the create environment endpoint in the management API.
Deployment promotion
When you promote a deployment to an environment, Baseten associates the deployment with that environment and applies the environment’s autoscaling settings. If the deployment can be reused directly, promotion completes without creating new resources. Otherwise, Baseten creates a new deployment with a unique ID, initializes its resources, and replaces the existing deployment in that environment.
A new deployment is created when:
- The deployment is already associated with another environment.
- The environment has a different instance type or resource profile.
- Re-deploy on promotion is enabled.
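The three conditions above can be read as a single boolean check. The following is a minimal sketch of that rule, not Baseten's implementation: the `Deployment` and `Environment` classes, their field names, and the instance type strings are hypothetical and exist only to illustrate the decision.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical stand-in for a deployment (not a Baseten API type)."""
    deployment_id: str
    environment: Optional[str]  # environment it is already associated with, if any
    instance_type: str


@dataclass
class Environment:
    """Hypothetical stand-in for an environment (not a Baseten API type)."""
    name: str
    instance_type: str
    redeploy_on_promotion: bool = False


def needs_new_deployment(deployment: Deployment, target: Environment) -> bool:
    """Return True if any documented condition for a new deployment holds."""
    # Condition 1: the deployment already belongs to another environment.
    in_other_environment = deployment.environment not in (None, target.name)
    # Condition 2: the target uses a different instance type or resource profile.
    different_resources = deployment.instance_type != target.instance_type
    # Condition 3: re-deploy on promotion is enabled on the target.
    return in_other_environment or different_resources or target.redeploy_on_promotion


# Example: an unassociated deployment with matching resources is reused.
candidate = Deployment("dep-1", environment=None, instance_type="gpu-small")
staging = Environment("staging", instance_type="gpu-small")
print(needs_new_deployment(candidate, staging))  # False
```

If none of the conditions hold, the sketch returns `False`, which corresponds to the fast path where promotion completes without creating new resources.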
Published deployment promotion
If a published deployment (not a development deployment) is promoted, its autoscaling settings are updated to match the environment. Previous deployments are demoted but remain in the system.
Direct deployment to an environment
You can deploy directly to a named environment by passing --environment with the environment name to truss push (for example, truss push --environment staging).
Only one active promotion per environment is allowed at a time.
Environment access in code
The environment name is available in model.py via the environment keyword argument.
Use the environment in the load() method to configure per-environment behavior.
If you use the environment in load(), you’ll need to enable re-deploy on promotion to ensure the environment is correctly initialized after each promotion. See Re-deploy on promotion for details.
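As an illustration of reading the environment in model.py, here is a minimal sketch. It assumes the `environment` keyword argument is either a mapping with a `"name"` key or a plain string; that shape is an assumption, so the helper handles both. The `log_level` setting is a hypothetical example of per-environment behavior.

```python
from typing import Any, Optional


def environment_name(environment: Any) -> Optional[str]:
    """Extract the environment name from the `environment` kwarg.

    Assumption: the value is a mapping with a "name" key, a plain string,
    or None when the deployment has no environment.
    """
    if environment is None:
        return None
    if isinstance(environment, dict):
        return environment.get("name")
    return str(environment)


class Model:
    def __init__(self, **kwargs: Any) -> None:
        # Keep the environment so load() can branch on it later.
        self._environment = kwargs.get("environment")
        self._log_level: Optional[str] = None

    def load(self) -> None:
        # Hypothetical per-environment setting: quieter logs in production.
        name = environment_name(self._environment)
        self._log_level = "WARNING" if name == "production" else "DEBUG"

    def predict(self, model_input: Any) -> dict:
        return {
            "environment": environment_name(self._environment),
            "log_level": self._log_level,
        }
```

Because the branch lives in load(), the chosen setting is fixed when the deployment starts, which is why re-deploy on promotion matters for this pattern.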
Re-deploy on promotion
By default, promoting a deployment reuses the existing deployment when possible. This is the fastest promotion path, but it means load() doesn’t re-run. Any environment-specific configuration set during the original load() call persists, even if the deployment moves to a different environment.
You can configure an environment to create a fresh deployment every time you promote to it. The new deployment runs load() with the target environment’s context, so environment-specific configuration takes effect.
Enable this if your load() method uses kwargs["environment"] to configure per-environment behavior, or if you promote the same source deployment to multiple environments and want each to get a fresh deployment.
Toggle Re-deploy when promoting in the environment settings on your model’s page in the Baseten dashboard, or set it via the update environment settings endpoint.
If you promote a deployment that’s already associated with an environment, Baseten creates a new deployment regardless of this setting.
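The effect of this setting on load() can be shown with a small simulation. This sketch does not call Baseten. `SimulatedDeployment` and `promote` are hypothetical names that model only one behavior described above: a reused deployment keeps the state from its original load() call, while a fresh deployment runs load() again with the target environment.

```python
class SimulatedDeployment:
    """Toy deployment that records which environment load() configured it for."""

    def __init__(self, environment_name: str) -> None:
        self.load_calls = 0
        self.config = {}
        self.load(environment_name)

    def load(self, environment_name: str) -> None:
        self.load_calls += 1
        self.config = {"configured_for": environment_name}


def promote(deployment: SimulatedDeployment, target: str, redeploy_on_promotion: bool) -> SimulatedDeployment:
    """Return the deployment that serves `target` after promotion."""
    if redeploy_on_promotion:
        # Fresh deployment: load() runs with the target environment's context.
        return SimulatedDeployment(target)
    # Reuse: load() does not re-run, so the original configuration persists.
    return deployment


original = SimulatedDeployment("staging")
reused = promote(original, "production", redeploy_on_promotion=False)
fresh = promote(original, "production", redeploy_on_promotion=True)

print(reused.config)  # {'configured_for': 'staging'}
print(fresh.config)   # {'configured_for': 'production'}
```

The reused deployment still reports the staging configuration after moving to production, which is the stale state that enabling re-deploy on promotion avoids.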
Regional environments
Regional environments restrict inference traffic to a specific geographic region for data residency compliance. When your organization enables regional environments, each environment gets a dedicated regional endpoint that routes directly to infrastructure in the designated region.
Your Baseten account team configures regional environments at the organization level. Contact them to enable regional environments.
Regional endpoint format
Regional endpoints embed the environment name in the hostname instead of the URL path. This applies to model, chain, WebSocket, and gRPC endpoints.
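Call a model’s regional endpoint with /predict or /async_predict. For example, for a model with ID abc123 in the prod-us environment, the hostname identifies the prod-us environment, so the request path is only /predict or /async_predict.
API restrictions on regional endpoints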
Regional endpoints derive the environment exclusively from the hostname. Path-based routing (/environments/, /production/, /deployment/) is rejected. For gRPC, don’t set x-baseten-environment or x-baseten-deployment metadata headers.
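A client can guard against these restrictions before sending a request. The following is a minimal sketch, assuming that "path-based routing" means a request path beginning with one of the three segments listed above. The function names are hypothetical and are not part of any Baseten client library.

```python
from typing import Iterable, List, Tuple

# Path prefixes listed above as rejected on regional endpoints.
REJECTED_PATH_PREFIXES = ("/environments/", "/production/", "/deployment/")

# gRPC metadata keys that should not be set on regional endpoints.
ROUTING_METADATA_KEYS = {"x-baseten-environment", "x-baseten-deployment"}


def validate_regional_path(path: str) -> str:
    """Return the path unchanged, or raise if it uses path-based routing."""
    if path.startswith(REJECTED_PATH_PREFIXES):
        raise ValueError(f"path-based routing is not allowed on regional endpoints: {path}")
    return path


def without_routing_metadata(metadata: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Drop environment and deployment routing keys from gRPC metadata pairs."""
    return [(key, value) for key, value in metadata if key.lower() not in ROUTING_METADATA_KEYS]
```

For example, `validate_regional_path("/predict")` returns the path unchanged, while `validate_regional_path("/production/predict")` raises `ValueError`.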
Environment deletion
You can delete environments, except for production. To remove a production deployment, first promote another deployment to production or delete the entire model.
- Deleted environments are removed from the overview but remain in billing history.
- They don’t consume resources after deletion.
- API requests to a deleted environment return a 404 error.