Getting Started
Building your first Truss
In this example, we go through building your first Truss model. We'll be using the HuggingFace transformers library to build a text classification model that detects the sentiment of a piece of text.
Step 1: Implementing the model
Set up imports for this model. In this example, we simply use the HuggingFace transformers library.
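For this example, the only import we need is the HuggingFace pipeline helper. This is a minimal sketch; your own model may need additional imports.

```python
from transformers import pipeline
```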
Every Truss model must implement a `Model` class. This class must have:

- an `__init__` function
- a `load` function
- a `predict` function
In the `__init__` function, set up any variables that will be used in the `load` and `predict` functions.
In the `load` function of the Truss, we implement the logic involved in downloading the model and loading it into memory. For this Truss example, we define a HuggingFace pipeline and choose the `text-classification` task, which uses BERT for text classification under the hood. Note that the `load` function runs once when the model starts.
In the `predict` function of the Truss, we implement the logic related to actual inference. For this example, we just call the HuggingFace pipeline that we set up in the `load` function. A sketch of the full class is shown below.
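Putting the pieces together, the model code for this example (typically `model/model.py` in a Truss) might look like the following sketch. The structure follows the `Model` class contract described above; the comments and the default pipeline model are illustrative.

```python
from transformers import pipeline


class Model:
    def __init__(self, **kwargs):
        # Set up state used by load() and predict(); the pipeline is
        # created in load() so __init__ stays lightweight.
        self._model = None

    def load(self):
        # Runs once when the model server starts: download the model
        # weights and load them into memory as a HuggingFace pipeline.
        self._model = pipeline("text-classification")

    def predict(self, model_input):
        # Called per request: run the pipeline on the input text and
        # return the predicted label and score.
        return self._model(model_input)
```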
Step 2: Writing the config.yaml
Each Truss has a `config.yaml` file where we can configure options related to the deployment. It's in this file that we define requirements, resources, and runtime options like secrets and environment variables.
Basic Options
In this section, we define basic metadata about the model, such as its name and the Python version to build with.
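A sketch of what this might look like in `config.yaml`; the model name is just an example, and `py39` is one possible Python version value.

```yaml
model_name: Text classification
python_version: py39
```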
Set up python requirements
In this section, we define any pip requirements that we need to run the model. For this example, we need PyTorch and Transformers.
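The requirements section might look like the following; the version pins are illustrative and should match whatever you test with locally.

```yaml
requirements:
  - torch==2.0.1
  - transformers==4.30.0
```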
Configure the resources needed
In this section, we configure the resources needed to deploy this model. Here, we have no need for a GPU, so we leave the accelerator section blank.
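A CPU-only resource block might look like this sketch; the CPU and memory values are illustrative.

```yaml
resources:
  cpu: "1"
  memory: 2Gi
  use_gpu: false
  accelerator: null
```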
Other config options
Truss also has provisions for adding other runtime options and system packages. In this example, we don't need these, so we leave them empty for now.
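Left empty, these sections might look like the following; the keys shown are common ones, and you can adjust them as needed.

```yaml
environment_variables: {}
secrets: {}
system_packages: []
```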
Step 3: Deploying & running inference
Deploy the model with the following command:
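With a recent version of the Truss CLI, deployment typically looks like this, run from the Truss directory (you may be prompted for an API key the first time):

```sh
truss push
```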
Then, you can perform inference with:
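One way to invoke the deployed model from the CLI; the input string is just an example.

```sh
truss predict -d '"Truss is awesome!"'
```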