Model with system packages
Deploy a model with both Python and system dependencies
Summary
To add system packages to your model serving environment, open config.yaml
and update the system_packages
key with a list of apt-installable Debian packages.
Example code:
system_packages:
- tesseract-ocr
Step-by-step example
LayoutLM Document QA is a multimodal model that answers questions about provided invoice documents.
The model requires a system package, tesseract-ocr
, which we need to include in the model serving environment.
You can see the code for the finished LayoutLM Document QA Truss on the right. Keep reading for step-by-step instructions on how to build it.
This example will cover:
- Implementing a
transformers.pipeline
model in Truss - Adding Python requirements to the Truss config
- Adding system requirements to the Truss config
- Setting sufficient model resources for inference
Step 0: Initialize Truss
Get started by creating a new Truss:
truss init layoutlm-document-qa
Give your model a name when prompted, like LayoutLM Document QA
. Then, navigate to the newly created directory:
cd layoutlm-document-qa
Step 1: Implement the Model
class
LayoutLM Document QA is a pipeline model, so it is straightforward to implement in Truss.
In model/model.py
, we write the class Model
with three member functions:
__init__
, which creates an instance of the object with a_model
propertyload
, which runs once when the model server is spun up and loads thepipeline
modelpredict
, which runs each time the model is invoked and handles the inference. It can use any JSON-serializable type as input and output.
Read the quickstart guide for more details on Model
class implementation.
from transformers import pipeline
class Model:
def __init__(self, **kwargs) -> None:
self._model = None
def load(self):
# Load the model from Hugging Face
self._model = pipeline(
"document-question-answering",
model="impira/layoutlm-document-qa",
)
def predict(self, model_input):
# Invoke the model and return the results
return self._model(
model_input["url"],
model_input["prompt"]
)
Step 2: Set Python dependencies
Now, we can turn our attention to configuring the model server in config.yaml
.
In addition to transformers
, LayoutLM Document QA has three other dependencies. We list them below as follows:
requirements:
- Pillow==10.0.0
- pytesseract==0.3.10
- torch==2.0.1
- transformers==4.30.2
Always pin exact versions for your Python dependencies. The ML/AI space moves fast, so you want to have an up-to-date version of each package while also being protected from breaking changes.
Step 3: Install system packages
One of the Python dependencies, pytesseract
, also requires a system package to operate.
Adding system packages works just like adding Python requirements. You can specify any package that’s available via apt
on Debian.
system_packages:
- tesseract-ocr
Step 4: Configure model resources
LayoutLM Document QA doesn’t require a GPU, but you’ll need a midrange CPU instance if you want reasonably fast invocation times. 4 CPU cores and 16 GiB of RAM is sufficient for the model.
Model resources are also set in config.yaml
and must be specified before you deploy the model.
resources:
cpu: "4"
memory: 16Gi
use_gpu: false
accelerator: null
Step 5: Deploy the model
You’ll need a Baseten API key for this step.
We have successfully packaged LayoutLM Document QA as a Truss. Let’s deploy!
truss push
You can invoke the model with:
truss predict -d '{"url": "https://templates.invoicehome.com/invoice-template-us-neat-750px.png", "prompt": "What is the invoice number?"}'