Deployments can be promoted to an environment (e.g., “staging”) to validate outputs before moving to production, allowing for safer model iteration and evaluation.
Using environments to manage deployments
Environments support structured validation before promoting a deployment, including:
- Automated tests and evaluations
- Manual testing in pre-production
- Gradual traffic shifts with canary deployments
- Shadow serving for real-world analysis
Each environment provides:
- Dedicated API endpoint → Predict API Reference
- Autoscaling controls → Scale behavior is managed per environment.
- Traffic ramp-up → Enable canary rollouts or rolling deployments.
- Monitoring and metrics → Export environment metrics.
The production environment has two restrictions:
- It can’t be deleted unless the entire model is removed.
- You can’t create additional environments named “production.”
Creating custom environments
In addition to the standard production environment, you can create as many custom environments as needed. There are two ways to create a custom environment:
- In the model management page on the Baseten dashboard.
- Via the create environment endpoint in the model management API.
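As a rough illustration of the API route, the sketch below builds (but does not send) a request that would create a custom environment. The URL path, the JSON body field name, and the Api-Key authorization scheme are assumptions and should be checked against the model management API reference. build_create_environment_request is a hypothetical helper, and abc123 is a placeholder model ID. The only rule taken from this page is that a second environment named "production" cannot be created.

```python
import json
import urllib.request

# Assumed base URL for the model management API; verify against the API reference.
API_BASE = "https://api.baseten.co/v1"


def build_create_environment_request(
    model_id: str, name: str, api_key: str
) -> urllib.request.Request:
    """Build (without sending) a request that would create a custom environment."""
    # Documented rule: additional environments named "production" are not allowed.
    if name == "production":
        raise ValueError('cannot create another environment named "production"')
    body = json.dumps({"name": name}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        url=f"{API_BASE}/models/{model_id}/environments",  # assumed path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Api-Key {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


req = build_create_environment_request("abc123", "staging", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it. The sketch stops short of that so it can be inspected without credentials.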
Promoting deployments to environments
When you promote a deployment, Baseten follows a three-step process:
- A new deployment is created with a unique deployment ID.
- The deployment initializes resources and becomes active.
- The new deployment replaces the existing deployment in that environment.
  - If there was no previous deployment, default autoscaling settings are applied.
  - If a previous deployment existed, the new one inherits autoscaling settings, and the old deployment is demoted and scales to zero.
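The promotion steps above can be pictured with a small in-memory model. This is only a toy simulation of the described behavior, not Baseten's implementation: the Deployment and Environment classes, the promote function, the replicas field, and the DEFAULT_AUTOSCALING values are all hypothetical names and placeholder numbers chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Placeholder defaults; the real default autoscaling values are not stated here.
DEFAULT_AUTOSCALING = {"min_replica": 0, "max_replica": 1}


@dataclass
class Deployment:
    deployment_id: str
    autoscaling: Dict[str, int] = field(default_factory=dict)
    active: bool = False
    replicas: int = 0


@dataclass
class Environment:
    name: str
    current: Optional[Deployment] = None


def promote(env: Environment, new_deployment_id: str) -> Deployment:
    """Simulate promoting a deployment into an environment."""
    # Step 1: a new deployment is created with a unique deployment ID.
    new = Deployment(deployment_id=new_deployment_id)
    # Step 2: the deployment initializes resources and becomes active.
    new.active = True
    new.replicas = 1  # illustrative: at least one replica is serving
    # Step 3: the new deployment replaces the existing one in the environment.
    old = env.current
    if old is None:
        # No previous deployment: default autoscaling settings are applied.
        new.autoscaling = dict(DEFAULT_AUTOSCALING)
    else:
        # Previous deployment: the new one inherits its autoscaling settings,
        # and the old deployment is demoted and scales to zero.
        new.autoscaling = dict(old.autoscaling)
        old.replicas = 0
    env.current = new
    return new


staging = Environment("staging")
first = promote(staging, "dep-1")
first.autoscaling["max_replica"] = 5  # user tunes the environment's settings
second = promote(staging, "dep-2")
print(second.autoscaling, first.replicas)
```

In this sketch the second promotion carries over the tuned max_replica value, while the first deployment drops to zero replicas.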
Promoting a published deployment
If a published deployment (not a development deployment) is promoted:
- Its autoscaling settings are updated to match the environment.
- If inactive, it must be activated before promotion.
Deploying directly to an environment
You can deploy directly to a named environment by passing the --environment flag to truss push, for example truss push --environment staging.
Only one active promotion per environment is allowed at a time.
Accessing environments in your code
The environment name is available in model.py via the environment keyword argument of the model's __init__ method.
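A minimal sketch of reading that keyword argument in model.py follows. The exact shape of the value is an assumption: the code accepts either a dictionary with a "name" key or a plain string, and treats a missing argument as "no environment". The environment_name property and the log-level switch are hypothetical, added only to show one way the value might be used.

```python
class Model:
    def __init__(self, **kwargs):
        # The argument may be absent, e.g. for a deployment that is not
        # attached to any environment.
        self._environment = kwargs.get("environment")
        self._log_level = None

    @property
    def environment_name(self):
        """Return the environment name, or None if it is not available."""
        env = self._environment
        if isinstance(env, dict):
            return env.get("name")  # assumed shape: {"name": "staging"}
        return env  # a plain string, or None

    def load(self):
        # Illustrative use: quieter logging in production, verbose elsewhere.
        if self.environment_name == "production":
            self._log_level = "WARNING"
        else:
            self._log_level = "DEBUG"

    def predict(self, model_input):
        return {"environment": self.environment_name, "input": model_input}


model = Model(environment={"name": "staging"})
model.load()
print(model.predict({"prompt": "hello"}))
```

Reading the value defensively with kwargs.get keeps the same model.py usable when no environment is supplied.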
Regional environments
Regional environments restrict inference traffic to a specific geographic region for data residency compliance. When your organization enables regional environments, each environment gets a dedicated regional endpoint that routes directly to infrastructure in the designated region.
Your Baseten account team configures regional environments at the organization level. Contact them to enable regional environments.
Regional endpoint format
Regional endpoints embed the environment name in the hostname instead of the URL path. This applies to each endpoint type:
- Model
- Chain
- WebSocket
- gRPC
Call a model’s regional endpoint with /predict or /async_predict, for example for a model with ID abc123 in the prod-us environment.
API restrictions on regional endpoints
Regional endpoints derive the environment exclusively from the hostname. Path-based routing (/environments/, /production/, /deployment/) is rejected. For gRPC, do not set x-baseten-environment or x-baseten-deployment metadata headers.
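These restrictions can be checked on the client side before a request is sent. The sketch below is a hypothetical pre-flight helper, not part of any Baseten SDK: regional_request_problems and its constants are invented names, and it only encodes the path prefixes and gRPC metadata keys listed above.

```python
from typing import Dict, List, Optional

# Path prefixes and gRPC metadata keys that regional endpoints do not accept,
# as listed in the text above.
REJECTED_PATH_PREFIXES = ("/environments/", "/production/", "/deployment/")
REJECTED_GRPC_METADATA = ("x-baseten-environment", "x-baseten-deployment")


def regional_request_problems(
    path: str, grpc_metadata: Optional[Dict[str, str]] = None
) -> List[str]:
    """Return human-readable problems for a request to a regional endpoint."""
    problems = []
    for prefix in REJECTED_PATH_PREFIXES:
        if path.startswith(prefix):
            problems.append(f"path-based routing via {prefix} is rejected")
    for key in grpc_metadata or {}:
        # Compare case-insensitively so "X-Baseten-Environment" is also caught.
        if key.lower() in REJECTED_GRPC_METADATA:
            problems.append(f"gRPC metadata header {key.lower()} must not be set")
    return problems


print(regional_request_problems("/predict"))
print(regional_request_problems("/environments/prod-us/predict"))
print(regional_request_problems("/predict", {"x-baseten-deployment": "abc"}))
```

An empty list means the request uses hostname-only routing, which is the form regional endpoints expect.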
Deleting environments
You can delete environments, except for production. To remove a production deployment, first promote another deployment to production or delete the entire model.
- Deleted environments are removed from the overview but remain in billing history.
- They do not consume resources after deletion.
- API requests to a deleted environment return a 404 error.