Development vs production
Model lifecycle on Baseten
Baseten has two special kinds of model deployment: development and production. Each has a different feature set from other published deployments, matching its role in the model lifecycle.
Feature | Development deployment | Production deployment | Other published deployments |
---|---|---|---|
API: Deployment ID | ☑️ | ☑️ | ☑️ |
API: Model ID | - | ☑️ | - |
Live reload | ☑️ | - | - |
Scale to zero | ☑️ | ☑️ | ☑️ |
Full autoscaling | - | ☑️ | ☑️ |
Zero-downtime updates | - | ☑️ | ☑️ |
Deactivate | ☑️ | ☑️ | ☑️ |
Delete | ☑️ | - | ☑️ |
Rename | - | ☑️ | ☑️ |
What is a development deployment?
A development deployment is designed to make it easier for you to iterate on your model in an environment that closely matches production. As such, development deployments have four special properties:
- Development deployments have live reload so you can patch changes onto the model server while it runs.
- Development deployments don’t have access to full autoscaling. They have a maximum of one replica and always scale to zero when not in use.
- Development deployments do not guarantee zero-downtime updates. A development deployment may be updated at any time, which may cause active requests to fail.
- Development deployments are always named development and cannot be renamed.
Live reload lets you use the Truss CLI to patch changes onto your running model server rather than waiting for an entirely new deployment.
What is a production deployment?
A production deployment is the main deployment of your model designated for production use.
When promoting from development to production, there are a few changes:
- The production deployment uses the production endpoint rather than the development endpoint.
- The production deployment has access to full autoscaling.
- Changes made to the development deployment do not affect the production deployment. Promoting a new deployment to production is the only way to update the production deployment.
- The deployment name can be changed to make it easier to identify.
A production deployment is fairly similar to any other published deployment, with two differences:
- A production deployment can be called using the production endpoint.
- A production deployment cannot be deleted (unless you delete the entire model).
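To make the difference between endpoints concrete, here is a minimal sketch of a helper that builds the URL for each kind of deployment. The host and path pattern, the function name `predict_url`, and the IDs in the usage lines are assumptions for illustration, not taken from this page; check the API reference for the exact endpoint format before relying on it.

```python
from typing import Optional

# Assumed URL pattern (not stated on this page); verify against the API reference.
BASE = "https://model-{model_id}.api.baseten.co"


def predict_url(model_id: str, target: str = "production",
                deployment_id: Optional[str] = None) -> str:
    """Build a predict URL for one deployment of a model.

    target is "production", "development", or "deployment". The last one
    addresses a specific published deployment and requires deployment_id.
    """
    base = BASE.format(model_id=model_id)
    if target in ("production", "development"):
        return f"{base}/{target}/predict"
    if target == "deployment":
        if not deployment_id:
            raise ValueError("deployment_id is required for target='deployment'")
        return f"{base}/deployment/{deployment_id}/predict"
    raise ValueError(f"unknown target: {target!r}")


# Hypothetical IDs, for illustration only.
print(predict_url("abc123"))
print(predict_url("abc123", "deployment", "dep456"))
```

The point of the sketch is that client code calling the production endpoint does not need to change when a new deployment is promoted, while code that targets a deployment ID is pinned to that one deployment.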
Promoting to production
Any active deployment can be promoted to production.
Promoting the development deployment to production
Promoting the development deployment to production triggers a three-step process:
- A new deployment is created, with a new deployment ID and name. It is not yet the production deployment.
- The new deployment is allocated resources and started up.
- Once active, the new deployment becomes the production deployment, replacing any existing production deployment.
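The autoscaling settings of the new deployment depend on whether a production deployment already exists:
- If there was no existing production deployment, the new deployment is created with standard autoscaling settings.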
- If there was an existing production deployment, the new deployment is created with the same autoscaling settings as the existing production deployment. The former production deployment is demoted but keeps its ID, autoscaling settings, and activity status.
Promoting the development deployment to production does not change the development deployment’s ID, autoscaling settings, or activity status. You can continue to iterate on the development deployment with live reload.
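The promotion rules above can be modeled as a small in-memory sketch. This is not Baseten's implementation: the `Deployment` and `Model` classes, the `promote_development` function, and the values in `STANDARD_AUTOSCALING` are hypothetical and exist only to show which fields change and which stay the same.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical stand-in for "standard autoscaling settings".
STANDARD_AUTOSCALING = {"min_replicas": 0, "max_replicas": 1}


@dataclass
class Deployment:
    deployment_id: str
    name: str
    autoscaling: dict
    active: bool = True


@dataclass
class Model:
    development: Deployment
    production: Optional[Deployment] = None
    published: List[Deployment] = field(default_factory=list)


def promote_development(model: Model, new_id: str, new_name: str) -> Deployment:
    # Step 1: create a new deployment with its own ID and name. It copies the
    # current production autoscaling settings, or uses the standard ones.
    if model.production is not None:
        autoscaling = dict(model.production.autoscaling)
    else:
        autoscaling = dict(STANDARD_AUTOSCALING)
    new = Deployment(new_id, new_name, autoscaling, active=False)

    # Step 2: resources are allocated and the deployment starts up.
    new.active = True

    # Step 3: the new deployment becomes production. The former production
    # deployment is demoted with its ID, settings, and status untouched.
    if model.production is not None:
        model.published.append(model.production)
    model.production = new
    return new
```

Note that `model.development` is never modified, which mirrors the fact that the development deployment keeps its ID, settings, and activity status after promotion.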
Promoting another published deployment to production
When promoting an already published deployment to production, keep the following in mind:
- A published deployment keeps its deployment ID and autoscaling settings when being promoted to production (unlike development deployments).
- If the deployment is scaled to zero, it stays scaled to zero when promoted. You can manually wake the model before promoting it.
- If the deployment is inactive, you must activate it and wait for it to start up before promoting it to production.
The previous production deployment keeps its deployment ID, autoscaling settings, and activity status and joins other deployments in the deployment list.
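For contrast with promoting the development deployment, the following sketch models promotion of an already published deployment. The class and function names are hypothetical, and the `replicas` field is only a stand-in for "scaled to zero"; the code illustrates the rules listed above rather than any real API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Deployment:
    deployment_id: str
    autoscaling: dict
    active: bool = True
    replicas: int = 1  # 0 models a deployment that is scaled to zero


def promote_published(
    candidate: Deployment,
    production: Optional[Deployment],
    others: List[Deployment],
) -> Tuple[Deployment, List[Deployment]]:
    """Return (new_production, other_deployments) after promotion."""
    # An inactive deployment must be activated and started before promotion.
    if not candidate.active:
        raise RuntimeError("activate the deployment before promoting it")

    # The candidate is promoted as-is: same ID, same autoscaling settings,
    # and the same replica count (a scaled-to-zero deployment stays at zero).
    remaining = [d for d in others if d is not candidate]

    # The previous production deployment rejoins the list unchanged.
    if production is not None:
        remaining.append(production)
    return candidate, remaining
```

No new deployment object is created here, which is the key difference from promoting the development deployment.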
Deploying directly to production
You can deploy a model directly to production, skipping the development stage and replacing any existing production deployment, by adding --promote to truss push:
cd my_model/
truss push --promote
Deactivating deployments
Any active deployment, including the production deployment, can be deactivated.
- Deactivated deployments remain visible in the model dashboard.
- Deactivated deployments do not consume model resources.
- Requests to a deactivated deployment’s endpoints will result in a 404 error.
- A deactivated deployment can be manually activated at any time from the model dashboard.
If you’re regularly activating or deactivating deployments in response to traffic, consider using autoscaling with scale to zero instead.
Deleting deployments
Any deployment of a model can be deleted except the production deployment. To delete the production deployment, first promote a different deployment to production (or delete the entire model).
- Deleted deployments are removed from the model dashboard but still appear in your billing and usage dashboard.
- Deleted deployments do not consume model resources.
- Requests to a deleted deployment’s endpoints will result in a 404 error.
- Deleting a deployment is a permanent action and cannot be undone.
If you aren’t completely certain about deleting a deployment, consider deactivating it instead until you’re ready to delete.