When you need custom preprocessing or postprocessing, or want to run a model that isn’t supported by Baseten’s built-in engines, you can write Python code in a model.py file. Truss provides a Model class with three methods (__init__, load, and predict) that give you full control over how your model initializes, loads weights, and handles requests.

Most deployments don’t need custom Python at all. If you’re deploying a supported open-source model, see Your first model for the config-only approach. Use custom model code when you need to:
  • Run a model architecture that Baseten’s engines don’t support.
  • Add custom preprocessing or postprocessing around inference (see the sketch after this list).
  • Combine multiple models or libraries in a single endpoint.
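As a minimal sketch of the second case, here is what wrapping inference with preprocessing and postprocessing might look like; the pipeline and the input/output field names are illustrative, not part of Truss:
from transformers import pipeline

class Model:
    def load(self):
        # Load the underlying model once at startup.
        self._pipe = pipeline("text-classification")

    def predict(self, model_input):
        # Preprocessing: normalize the raw request payload.
        text = model_input["text"].strip()
        # Inference.
        result = self._pipe(text)[0]
        # Postprocessing: reshape the output for the client.
        return {"label": result["label"], "score": round(result["score"], 4)}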

Prerequisites

To use Truss, install a recent Truss version and ensure pydantic is v2:
pip install --upgrade truss 'pydantic>=2.0.0'
Truss requires Python >=3.9,<3.15. To set up a fresh development environment, you can use the following commands, which create an environment named truss_env using pyenv:
curl https://pyenv.run | bash
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo '[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init -)"' >> ~/.bashrc
source ~/.bashrc
pyenv install 3.11.0
ENV_NAME="truss_env"
pyenv virtualenv 3.11.0 $ENV_NAME
pyenv activate $ENV_NAME
pip install --upgrade truss 'pydantic>=2.0.0'
To deploy remotely, you also need a Baseten account. Export your API key for the current shell session, or add it permanently to your .bashrc:
~/.bashrc
export BASETEN_API_KEY="nPh8..."

Initialize your model

Create a new Truss project with truss init.
$ truss init hello-world
? 📦 Name this model: HelloWorld
Truss HelloWorld was created in ~/hello-world
This creates a directory with the following structure:
  • config.yaml: Configuration for dependencies, resources, and deployment settings.
  • model/model.py: Your model code.
  • packages/: Optional local Python packages.
  • data/: Optional data files bundled with your model.
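On disk, the generated project looks roughly like this:
hello-world/
├── config.yaml
├── model/
│   └── model.py
├── packages/
└── data/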

config.yaml

The config.yaml file configures dependencies, resources, and other settings. Here’s the default:
config.yaml
build_commands: []
environment_variables: {}
external_package_dirs: []
model_metadata: {}
model_name: HelloWorld
python_version: py311
requirements: []
resources:
  accelerator: null
  cpu: '1'
  memory: 2Gi
  use_gpu: false
secrets: {}
system_packages: []
The fields you’ll use most often:
  • requirements: Python packages installed at build time (pip format).
  • resources: CPU, memory, and GPU allocation.
  • secrets: Secret names your model needs at runtime, such as a Hugging Face access token.
See the Configuration page for the full reference.
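As an example, a config for a GPU-backed model might look like the sketch below; the accelerator type and secret name are illustrative, and secret values are set in your Baseten workspace, not in the file:
config.yaml
model_name: HelloWorld
python_version: py311
requirements:
  - transformers
  - torch
resources:
  accelerator: T4
  cpu: '4'
  memory: 16Gi
  use_gpu: true
secrets:
  hf_access_token: null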

model.py

The model.py file defines a Model class with three methods:
class Model:
    def __init__(self, **kwargs):
        pass

    def load(self):
        pass

    def predict(self, model_input):
        return model_input
  • __init__: Runs once when the model object is instantiated. Initialize variables and store configuration here.
  • load: Runs once at startup, before any requests. Load model weights, tokenizers, and other heavy resources here. Separating this from __init__ keeps expensive operations out of the request path.
  • predict: Runs on every API request. Process input, run inference, and return the response.
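Truss passes runtime context, including the secrets declared in config.yaml, as keyword arguments to __init__. A common pattern is to stash them there and use them in load; hf_access_token below is an assumed secret name:
from transformers import pipeline

class Model:
    def __init__(self, **kwargs):
        # kwargs includes the secrets declared in config.yaml.
        self._secrets = kwargs["secrets"]

    def load(self):
        # Use a secret, e.g. to access gated Hugging Face models.
        self._model = pipeline(
            "text-classification",
            token=self._secrets["hf_access_token"],
        )

    def predict(self, model_input):
        return self._model(model_input)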

Deploy your model

Deploy with truss push.
$ truss push
This packages your code and config, builds a container, and deploys it to Baseten.

Invoke your model

After deployment, call your model at the invocation URL:
$ curl -X POST https://model-{model-id}.api.baseten.co/development/predict \
  -H "Authorization: Api-Key $BASETEN_API_KEY" \
  -d '"some text"'
You should see:
"some text"

Example: text classification

To see the Model class in action, deploy a text classification model from Hugging Face using the transformers library.

config.yaml

Add transformers and torch as dependencies:
config.yaml
requirements:
  - transformers
  - torch

model.py

Load the classification pipeline in load and run it in predict:
model.py
from transformers import pipeline

class Model:
    def __init__(self, **kwargs):
        pass

    def load(self):
        self._model = pipeline("text-classification")

    def predict(self, model_input):
        return self._model(model_input)
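Because this is plain Python, you can sanity-check the class locally before deploying, assuming transformers and torch are installed in your environment:
model = Model()
model.load()  # downloads the default text-classification weights on first run
print(model.predict("Truss is awesome!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]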

Deploy and call

Deploy with truss push, then call the endpoint:
$ truss push
$ curl -X POST https://model-{model-id}.api.baseten.co/development/predict \
  -H "Authorization: Api-Key $BASETEN_API_KEY" \
  -d '"some text"'

Next steps

  • Configuration: Full reference for config.yaml options.
  • Implementation: Advanced model patterns including streaming, async, and custom health checks.
  • Your first model: Deploy a model with just a config file, no custom Python needed.