Quickstart guide

Package and deploy a model in less than five minutes.
This guide shows how to:
  • Package a model and its dependencies with Truss
  • Deploy a packaged model to Baseten
  • Invoke the deployed model
In less than five minutes, you'll have a transformers text classification pipeline deployed behind an autoscaling API endpoint.

Install Baseten

To deploy models, you need an API key and the Baseten Python client.
First, install the client. In your terminal, run:
pip install --upgrade baseten
To authenticate with your API key, run:
baseten login
You will be prompted to paste your API key after running the command.

Package a model

In this doc, we'll deploy a text classification model. We're going to use a text classification pipeline from the open-source transformers package, which includes many pre-trained models.
We'll use Truss, an open-source model packaging library maintained by Baseten, to package the model.

Create a Truss

To get started, create a Truss with the following terminal command:
truss init text-classification
This will create an empty Truss at ./text-classification.

Implement the model

The model serving code goes in ./text-classification/model/model.py in your newly created Truss.
from typing import List

from transformers import pipeline


class Model:
    def __init__(self, **kwargs) -> None:
        self._model = None

    def load(self):
        self._model = pipeline("text-classification")

    def predict(self, model_input: str) -> List:
        return self._model(model_input)
There are two functions to implement:
  • load() runs once when the model is spun up and is responsible for initializing self._model
  • predict() runs each time the model is invoked and handles the inference. It can use any JSON-serializable type as input and output.
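To see the load()/predict() lifecycle without downloading model weights, here is a minimal standalone sketch. The fake_pipeline function and its hard-coded score are invented stand-ins for transformers.pipeline, used only so the example runs instantly:

```python
from typing import List


def fake_pipeline(task: str):
    # Stand-in for transformers.pipeline("text-classification"):
    # always labels the input POSITIVE with a fixed score.
    def classify(text: str) -> List[dict]:
        return [{"label": "POSITIVE", "score": 0.99}]
    return classify


class Model:
    def __init__(self, **kwargs) -> None:
        # Heavy initialization is deferred to load().
        self._model = None

    def load(self):
        # Runs once when the model server starts; in a real Truss this
        # would call transformers.pipeline("text-classification").
        self._model = fake_pipeline("text-classification")

    def predict(self, model_input: str) -> List:
        # Runs on every request.
        return self._model(model_input)


model = Model()
model.load()
print(model.predict("Deploying was easy!"))
```

Deferring initialization to load() keeps the constructor cheap, so the server process can start quickly and report readiness only once the model is actually in memory.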

Add model dependencies

The pipeline model relies on Transformers and PyTorch. These dependencies must be specified in the Truss config.
In ./text-classification/config.yaml, find the requirements line and replace the empty list with:
requirements:
- torch==2.0.1
- transformers==4.30.0
No other configuration needs to be changed.

Deploy a model

With the model packaged as a Truss, you can now deploy it to Baseten. Run the following Python script:
import truss
import baseten
th = truss.load("text-classification")
baseten.deploy(th, model_name="Quickstart text classification")
The console output includes information about the model, including the model version ID:
...
BasetenDeployedModel<
  model_version_id=qzk76xq
  name=Quickstart text classification
>
The model will be deployed on Baseten. You can find it on your models page.

Invoke a model

Once the model is deployed, you'll see a green "Active" badge (and receive an email notification).
[Screenshot: a model with an "Active" badge]
You can invoke the model via the Python client:
model = baseten.deployed_model_version_id("$MODEL_VERSION_ID") # $MODEL_VERSION_ID is from the previous step
model.predict("I am very happy that I just deployed this model!")
You'll receive the model output as a list:
[{'label': 'POSITIVE', 'score': 0.9996334314346313}]
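Because the response is plain Python data, you can work with it directly. A small sketch, using a hard-coded copy of the output above in place of a live model call:

```python
# Shape of a text-classification response: a list of
# {"label", "score"} dicts, one entry per input.
result = [{"label": "POSITIVE", "score": 0.9996334314346313}]

top = result[0]
if top["label"] == "POSITIVE" and top["score"] > 0.9:
    # score formats as a percentage, e.g. "99.96%"
    print(f"Confidently positive ({top['score']:.2%})")
```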