Some Python dependencies require system-level packages to function. Truss allows you to specify APT-installable Debian packages in config.yaml.

Adding System Packages

Specify system dependencies under system_packages:

config.yaml
system_packages:
- tesseract-ocr
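
A system package such as tesseract-ocr typically provides an executable that a Python wrapper calls at runtime. As a rough sketch (not part of Truss itself), a check like the one below could confirm that the executable is on the PATH in the built environment. The helper name `missing_binaries` is hypothetical, and `tesseract` is assumed to be the executable name that the tesseract-ocr package installs.

```python
import shutil


def missing_binaries(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # "tesseract" is assumed to be the executable installed by tesseract-ocr.
    missing = missing_binaries(["tesseract"])
    if missing:
        print("Missing system binaries:", ", ".join(missing))
    else:
        print("All required system binaries were found.")
```

Running a check like this early, for example at the start of model loading, would surface a missing system package as a clear message rather than as a failure deep inside a Python dependency.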

Example: LayoutLM Document QA

LayoutLM Document QA is a model that requires tesseract-ocr for text recognition. Below is a minimal setup for deploying it with Truss.

1. Initialize Truss

truss init layoutlm-document-qa && cd layoutlm-document-qa

2. Implement the Model Class

Define the model in model/model.py:

model/model.py
from transformers import pipeline

class Model:
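    def __init__(self, **kwargs) -> None:
        # The pipeline is created in load(), not here, so construction stays cheap.
        self._model = None

    def load(self):
        # Runs once at startup: downloads the weights and builds the pipeline.
        self._model = pipeline(
            "document-question-answering",
            model="impira/layoutlm-document-qa",
        )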

    def predict(self, model_input):
        return self._model(model_input["url"], model_input["prompt"])
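
To illustrate the order in which the methods are used and the shape of the input that predict expects, here is a minimal local sketch. It swaps the real transformers pipeline for a stand-in so it runs without downloading weights. `FakePipeline` and the `pipeline_factory` argument are hypothetical names introduced only for this illustration; they are not part of Truss or transformers.

```python
import json


class FakePipeline:
    """Hypothetical stand-in for the transformers pipeline, for local checks only."""

    def __call__(self, image, question):
        # Echo the inputs in a shape loosely resembling a QA result.
        return [{"answer": f"stub answer for: {question}", "image": image}]


class Model:
    def __init__(self, pipeline_factory=FakePipeline, **kwargs) -> None:
        self._pipeline_factory = pipeline_factory  # hypothetical injection point
        self._model = None

    def load(self):
        self._model = self._pipeline_factory()

    def predict(self, model_input):
        return self._model(model_input["url"], model_input["prompt"])


# The request body is a dict with the two keys that predict() reads.
model_input = {
    "url": "https://templates.invoicehome.com/invoice-template-us-neat-750px.png",
    "prompt": "What is the invoice number?",
}

model = Model()
model.load()  # must run before predict(); otherwise self._model is None
result = model.predict(model_input)
print(result[0]["answer"])

# The same dict serialized as JSON, as it would be passed on a command line.
print(json.dumps(model_input))
```

The stand-in only verifies the wiring between load and predict. The answers and output format of the real model will differ.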

3. Set Dependencies

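Add Python and system dependencies in config.yaml. One of the Python dependencies, pytesseract, requires the tesseract-ocr system package to operate. Adding system packages works just like adding Python requirements: you can specify any package that is available via apt on Debian.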

config.yaml
requirements:
  - Pillow==10.0.0
  - pytesseract==0.3.10
  - torch==2.0.1
  - transformers==4.30.2

system_packages:
  - tesseract-ocr

TIP: Always pin exact package versions to avoid breaking changes.
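
As a small illustration of the tip above, the following sketch flags requirement strings that are not pinned to an exact version. The helper name `unpinned` is hypothetical, and the check is deliberately simple: it only looks for a `name==version` form and does not handle extras, environment markers, or wildcard versions.

```python
def unpinned(requirements):
    """Return requirement strings that are not pinned with an exact `==` version."""
    result = []
    for req in requirements:
        name, sep, version = req.partition("==")
        # A pin needs a package name, the `==` operator, and a non-empty version.
        if not (name.strip() and sep and version.strip()):
            result.append(req)
    return result


requirements = [
    "Pillow==10.0.0",
    "pytesseract==0.3.10",
    "torch==2.0.1",
    "transformers==4.30.2",
]

print(unpinned(requirements))             # every entry above is pinned
print(unpinned(["torch>=2.0", "numpy"]))  # neither of these is pinned
```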

4. Configure Model Resources

Specify the compute resources for the model in config.yaml. This example runs on CPU only, with no GPU accelerator:

config.yaml
resources:
  cpu: "4"
  memory: 16Gi
  use_gpu: false
  accelerator: null
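
For readers unfamiliar with the `Gi` suffix, the sketch below converts a memory string to bytes. It assumes the suffix follows the Kubernetes-style binary convention (1Gi = 2^30 bytes); that convention, and the helper name `memory_to_bytes`, are assumptions made for illustration and not something this page states.

```python
# Binary suffixes, assuming the Kubernetes-style convention (an assumption here).
_BINARY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def memory_to_bytes(quantity):
    """Convert a string such as '16Gi' to a number of bytes."""
    for suffix, factor in _BINARY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: treat the value as plain bytes


print(memory_to_bytes("16Gi"))  # 17179869184
```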

5. Deploy and Invoke

We have successfully packaged LayoutLM Document QA as a Truss. Let’s deploy! You’ll need a Baseten API key for this step.

truss push
truss predict -d '{"url": "https://templates.invoicehome.com/invoice-template-us-neat-750px.png", "prompt": "What is the invoice number?"}'
