model.py file. Truss provides a Model class with three methods (__init__, load, and predict) that give you full control over how your model initializes, loads weights, and handles requests.
Most deployments don’t need custom Python at all. If you’re deploying a supported open-source model, see Your first model for the config-only approach. Use custom model code when you need to:
- Run a model architecture that Baseten’s engines don’t support.
- Add custom preprocessing or postprocessing around inference.
- Combine multiple models or libraries in a single endpoint.
Prerequisites
To use Truss, install a recent Truss version and ensure pydantic is v2.

Help for setting up a clean development environment

Truss requires Python >=3.9,<3.15. To set up a fresh development environment, you can create an environment named truss_env using pyenv (with pyenv's shell integration added to your ~/.bashrc).
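A sketch of that setup, assuming pyenv and the pyenv-virtualenv plugin are installed (the Python version shown is just an example):

```shell
# Install a supported Python version and create an isolated environment.
pyenv install 3.11
pyenv virtualenv 3.11 truss_env
pyenv activate truss_env

# Install a recent Truss and pydantic v2.
pip install --upgrade truss "pydantic>=2,<3"
```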
Initialize your model
Create a new Truss project with `truss init`.
- `config.yaml`: Configuration for dependencies, resources, and deployment settings.
- `model/model.py`: Your model code.
- `packages/`: Optional local Python packages.
- `data/`: Optional data files bundled with your model.
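For example (the directory name here is just an illustration):

```shell
# Scaffold a new Truss project in a directory called my-model.
truss init my-model
cd my-model
```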
config.yaml
The `config.yaml` file configures dependencies, resources, and other settings. Here’s the default:
config.yaml
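The exact defaults depend on your Truss version; a representative sketch:

```yaml
model_name: my-model   # display name for the deployment
python_version: py39   # Python used in the serving image
requirements: []       # pip packages installed at build time
resources:
  cpu: "1"
  memory: 2Gi
  use_gpu: false
  accelerator: null
secrets: {}            # secret names resolved at runtime
```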
- `requirements`: Python packages installed at build time (pip format).
- `resources`: CPU, memory, and GPU allocation.
- `secrets`: Secret names your model needs at runtime, such as Hugging Face API keys.
model.py
The `model.py` file defines a `Model` class with three methods:
- `__init__`: Runs when the class is created. Initialize variables and store configuration here.
- `load`: Runs once at startup, before any requests. Load model weights, tokenizers, and other heavy resources here. Separating this from `__init__` keeps expensive operations out of the request path.
- `predict`: Runs on every API request. Process input, run inference, and return the response.
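A minimal skeleton along these lines (the keyword arguments and placeholder model are illustrative; Truss passes values like config and secrets into `__init__` as kwargs):

```python
class Model:
    def __init__(self, **kwargs):
        # Keep this lightweight: stash configuration, don't load weights yet.
        self._config = kwargs.get("config")
        self._secrets = kwargs.get("secrets")
        self._model = None

    def load(self):
        # Runs once at startup, before any requests: load heavy resources here.
        self._model = lambda x: x  # placeholder for real weight loading

    def predict(self, model_input):
        # Runs on every request: process input, run inference, return output.
        return {"output": self._model(model_input)}
```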
Deploy your model
Deploy with `truss push`.
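For example, from the project directory (Truss prompts for a Baseten API key the first time; the `--publish` flag is shown as a common option, check `truss push --help` for your version):

```shell
# Build and deploy a development deployment to Baseten.
truss push

# Or deploy straight to a published deployment.
truss push --publish
```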
Invoke your model
After deployment, call your model at the invocation URL.

Example: text classification
To see the `Model` class in action, deploy a text classification model from Hugging Face using the `transformers` library.
config.yaml
Add `transformers` and `torch` as dependencies:
config.yaml
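For example (versions are left unpinned here for brevity; pinning them is good practice):

```yaml
requirements:
  - transformers
  - torch
```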
model.py
Load the classification pipeline in `load` and run it in `predict`:
model.py
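A sketch of that `model.py`, assuming the default model for the `text-classification` pipeline task; the input schema (`{"text": ...}`) is an assumption, and the import sits inside `load` so the heavy dependency stays out of `__init__`:

```python
class Model:
    def __init__(self, **kwargs):
        self._pipeline = None

    def load(self):
        # Runs once at startup: download weights and build the pipeline.
        from transformers import pipeline
        self._pipeline = pipeline(task="text-classification")

    def predict(self, model_input):
        # Runs per request: assumes {"text": "..."} input, returns label/score dicts.
        return self._pipeline(model_input["text"])
```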
Deploy and call
Deploy with `truss push`, then call the endpoint:
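A sketch of calling the endpoint from Python; the URL shape and `Api-Key` header follow Baseten's invocation convention, but the model ID and key are placeholders, so confirm the exact URL in your model dashboard:

```python
import json
import urllib.request

def invocation_url(model_id: str) -> str:
    # Baseten-style production invocation URL for a deployed model.
    return f"https://model-{model_id}.api.baseten.co/environments/production/predict"

def invoke(model_id: str, api_key: str, payload: dict) -> dict:
    # POST a JSON payload to the model and return the parsed response.
    req = urllib.request.Request(
        invocation_url(model_id),
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example call for the text classification model above:
# invoke("abcd1234", "YOUR_API_KEY", {"text": "Truss is awesome!"})
```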
Next steps
- Configuration: Full reference for `config.yaml` options.
- Implementation: Advanced model patterns including streaming, async, and custom health checks.
- Your first model: Deploy a model with just a config file, no custom Python needed.