Quickstart guide
Package and deploy a model in less than five minutes.
This guide shows how to:
- Package a model and its dependencies with Truss
- Deploy a packaged model to Baseten
- Invoke the deployed model
In less than five minutes, you'll have a transformers text classification pipeline deployed behind an autoscaling API endpoint.
To deploy models, you need a Baseten API key and the Baseten Python client. You can generate an API key from your Baseten account settings.

To install the Baseten Python client, run:

```shell
pip install --upgrade baseten
```

To authenticate with your API key, run:

```shell
baseten login
```

You will be prompted to paste your API key after running the command.
In this guide, we'll deploy a text classification model. We're going to use a text classification pipeline from the open-source `transformers` package, which includes many pre-trained models. We'll use Truss, an open-source model packaging library maintained by Baseten, to package the model.
To get started, create a Truss with the following terminal command:
```shell
truss init text-classification
```
This will create an empty Truss at `./text-classification`.

The model serving code goes in `./text-classification/model/model.py` in your newly created Truss:

```python
from typing import List

from transformers import pipeline


class Model:
    def __init__(self, **kwargs) -> None:
        self._model = None

    def load(self):
        self._model = pipeline("text-classification")

    def predict(self, model_input: str) -> List:
        return self._model(model_input)
```
There are two functions to implement:

- `load()` runs once when the model is spun up and is responsible for initializing `self._model`.
- `predict()` runs each time the model is invoked and handles the inference. It can use any JSON-serializable type as input and output.
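To build intuition for this lifecycle, here's a dependency-free sketch of the same class. The echo "pipeline" below is a hypothetical stand-in, not part of `transformers`:

```python
from typing import Any, List


class EchoModel:
    """Dependency-free sketch of the Truss model lifecycle above."""

    def __init__(self, **kwargs) -> None:
        # Construction stays cheap; no weights are loaded here.
        self._model = None

    def load(self) -> None:
        # Runs once at startup. A real model would load weights here;
        # this stand-in just installs a hypothetical echo "pipeline".
        self._model = lambda text: [{"label": "ECHO", "score": 1.0, "text": text}]

    def predict(self, model_input: str) -> List[Any]:
        # Runs on every request, after load() has initialized self._model.
        return self._model(model_input)


model = EchoModel()
model.load()  # the server calls this once when the model spins up
print(model.predict("hello"))  # [{'label': 'ECHO', 'score': 1.0, 'text': 'hello'}]
```

The separation matters in production: the server pays the initialization cost once in `load()`, so each `predict()` call only does inference.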
The pipeline model relies on Transformers and PyTorch. These dependencies must be specified in the Truss config.
In `./text-classification/config.yaml`, find the line `requirements`. Replace the empty list with:

```yaml
requirements:
  - torch==2.0.1
  - transformers==4.30.0
```
No other configuration needs to be changed.
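For orientation, a freshly generated `config.yaml` looks roughly like the sketch below after the edit. The exact fields and defaults vary by Truss version, so treat everything other than `requirements` as illustrative:

```yaml
# Sketch of a generated config.yaml; only requirements was edited.
environment_variables: {}
python_version: py39
requirements:
  - torch==2.0.1
  - transformers==4.30.0
resources:
  cpu: 500m
  memory: 512Mi
  use_gpu: false
secrets: {}
system_packages: []
```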
With the model packaged as a Truss, you can now deploy it to Baseten. Run the following Python script:

```python
import truss
import baseten

th = truss.load("text-classification")
baseten.deploy(th, model_name="Quickstart text classification")
```
The console output includes information about the model, including the model version ID:

```
...
BasetenDeployedModel<
  model_version_id=qzk76xq
  name=Quickstart text classification
>
```
Once the model is deployed, you'll see a green "Active" badge (and receive an email notification).
(Screenshot: an active model)
You can invoke the model via the Python client:

```python
model = baseten.deployed_model_version_id("$MODEL_VERSION_ID")  # $MODEL_VERSION_ID is from the previous step
model.predict("I am very happy that I just deployed this model!")
```
You'll receive the model output as a list:

```python
[{'label': 'POSITIVE', 'score': 0.9996334314346313}]
```
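If you only need the predicted class, you can post-process the returned list. This sketch hard-codes the example output above rather than calling the deployed model:

```python
# The pipeline returns a list of {label, score} dicts; pick the highest-scoring one.
result = [{"label": "POSITIVE", "score": 0.9996334314346313}]

top = max(result, key=lambda r: r["score"])
print(f"{top['label']} ({top['score']:.2%})")  # POSITIVE (99.96%)
```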