Config options
Set your model resources, dependencies, and more
Truss is configurable to its core. Every Truss must include a file config.yaml in its root directory, which is automatically generated when the Truss is created. However, configuration is optional. Every configurable value has a sensible default, and a completely empty config file is valid.
Example
Here's an example config file for a Truss that uses the WizardLM model:
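A sketch of what that config might look like; the dependency pins and resource values here are illustrative, not the exact WizardLM settings:

```yaml
model_name: WizardLM
description: An instruction-following LLM.
python_version: py311
requirements:
  - accelerate==0.20.3
  - torch==2.0.1
  - transformers==4.30.2
resources:
  cpu: "3"
  memory: 14Gi
  use_gpu: true
  accelerator: A10G
```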
Full config reference
model_name
Name of your model.
description
Describe your model for documentation purposes.
model_class_name
(default: Model)
The name of the class that defines your Truss model. Note that this class must implement at least a predict method.
model_module_dir
(default: model)
Folder in the Truss where the model class is found.
data_dir
(default: data/)
Folder where data files are placed in your Truss. Note that you can access these files within your model like so:
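A minimal sketch, assuming the standard Truss model constructor, which receives the data directory via keyword arguments (the filename weights.bin is hypothetical):

```python
class Model:
    def __init__(self, **kwargs):
        # Truss passes the data directory to the model constructor.
        self._data_dir = kwargs["data_dir"]

    def load(self):
        # Read a file bundled under data/ at build time.
        with open(self._data_dir / "weights.bin", "rb") as f:
            self._weights = f.read()
```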
packages
(default: packages/)
Folder in the Truss to put your custom packages. Inside the packages folder you can place your own code that you want to reference inside model.py. For example, imagine you have the project setup below:
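Something like this, where my_package and helpers.py are hypothetical names:

```
my-truss/
├── config.yaml
├── model/
│   └── model.py
└── packages/
    └── my_package/
        ├── __init__.py
        └── helpers.py
```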
Inside the model.py, the package can be imported like this:
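A sketch, assuming the hypothetical layout above:

```python
# Anything under packages/ is importable by name inside model.py.
from my_package import helpers
```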
external_package_dirs
External package dirs enable a Truss to access custom packages located outside of it. This is a convenient way to allow multiple Trusses to access the same package.
Here is an example project structure. Note that the external package, super_cool_awesome_plugin/, is outside the Truss.
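Something like the following; the file names inside the plugin are illustrative:

```
.
├── stable-diffusion/
│   ├── config.yaml
│   └── model/
│       └── model.py
└── super_cool_awesome_plugin/
    ├── __init__.py
    └── plugin.py
```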
Inside the stable-diffusion/config.yaml, the path to your external package needs to be specified. For the example above, the config.yaml would look like this:
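A sketch, assuming the layout above:

```yaml
external_package_dirs:
  - ../super_cool_awesome_plugin/
```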
Inside stable-diffusion/model/model.py, the super_cool_awesome_plugin/ package can be imported like so:
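A sketch, assuming the external directory is bundled as a package at build time:

```python
# Directories listed in external_package_dirs are made importable at runtime.
import super_cool_awesome_plugin
```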
environment_variables
Do not store secret values directly in environment variables (or anywhere in the config file). See the secrets arg for information on properly managing secrets.
Any environment variables can be provided here as key value pairs and are exposed to the environment that the model executes in. Many Python libraries can be customized using environment variables, so this field can be quite handy in those scenarios.
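A sketch with illustrative key-value pairs:

```yaml
environment_variables:
  HF_HOME: /tmp/hf-cache
  TOKENIZERS_PARALLELISM: "false"
```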
model_metadata
Set any additional metadata in this catch-all field. The entire contents of the config file are available to the model at runtime, so this is a good place to store any custom information that the model needs. For example, scikit-learn models include a flag here that indicates whether the model supports returning probabilities alongside predictions.
This is also where display metadata can be stored.
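For instance, the scikit-learn flag mentioned above might be set like this (the exact key is whatever your model code reads):

```yaml
model_metadata:
  supports_predict_proba: true
```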
requirements_file
Path to the requirements file with the required Python dependencies.
We strongly recommend pinning versions in your requirements.
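For example (path illustrative):

```yaml
requirements_file: ./requirements.txt
```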
requirements
List the Python dependencies that the model depends on. The requirements should be provided in the pip requirements file format, but as a YAML list. These requirements are installed after the ones from requirements_file.
We strongly recommend pinning versions in your requirements.
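A sketch, with illustrative pins:

```yaml
requirements:
  - torch==2.0.1
  - transformers==4.30.2
```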
resources
The resources section is where you specify the compute resources that your model needs. This includes CPU, memory, and GPU resources.
If you need a GPU, you must also set resources.use_gpu to true.
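For example, a GPU-backed model might request (values illustrative):

```yaml
resources:
  cpu: "3"
  memory: 14Gi
  use_gpu: true
  accelerator: A10G
```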
resources.cpu
CPU resources needed, expressed as either a raw number or "millicpus". For example, 1000m and 1 are equivalent. Fractional CPU amounts can be requested using millicpus; for example, 500m is half of a CPU core.
resources.memory
CPU RAM needed, expressed as a number with units. Acceptable units include "Gi" (Gibibytes), "G" (Gigabytes), "Mi" (Mebibytes), and "M" (Megabytes). For example, 1Gi and 1024Mi are equivalent.
resources.use_gpu
Whether or not a GPU is required for this model.
resources.accelerator
Which GPU you would like for your instance. Available Nvidia GPUs supported in Truss include:
T4
L4
A10G
V100
A100
H100
H100_40GB
Note that if your model requires multiple GPUs (i.e. the weights don't fit in a single GPU), you can use the : operator to request multiple GPUs on your instance, e.g.:
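For instance, to request two A100s:

```yaml
resources:
  use_gpu: true
  accelerator: A100:2
```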
secrets
A model may depend on certain secret values that can't be bundled with the model and need to be bound securely at runtime. For example, a model may need to download information from S3 and may need access to AWS credentials for that.
This field can be used to specify the keys for such secrets and dummy default values. Never store actual secret values in the config. Dummy default values are instructive of what the actual values look like and thus act as documentation of the format.
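A sketch with a hypothetical secret key and a dummy value:

```yaml
secrets:
  aws_access_key_id: "AKIA-DUMMY-VALUE"  # dummy default, never the real secret
```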
system_packages
Specify any system packages that you would typically install using apt on a Debian operating system.
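For example (package choice illustrative):

```yaml
system_packages:
  - ffmpeg
  - libsm6
  - libxext6
```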
python_version
Which version of Python you'd like to use. Supported versions include:
- py39
- py310
- py311
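For example, to use Python 3.11:

```yaml
python_version: py311
```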
base_image
The base_image option is used if you need to bring your own custom base image. Custom base images are useful if there are scripts that need to run at build time, or dependencies that are complicated to install. After creating a custom base image, you can specify it in this field.
See Custom Base Images for more detail on how to use these.
base_image.image
A path to the docker image you'd like to use. For example, nvcr.io/nvidia/nemo:23.03.
base_image.python_executable_path
A path to the Python executable on the image. For instance, /usr/bin/python.
Tying it together, a custom base image configuration might look like this:
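```yaml
base_image:
  image: nvcr.io/nvidia/nemo:23.03
  python_executable_path: /usr/bin/python
```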
base_image.docker_auth
Docker registry authentication options, in the case that you want to use a base image hosted on a private registry.
base_image.docker_auth.auth_method
Enum, representing what authentication method you'd like to use to authenticate to the private registry. Currently supports:
GCP_SERVICE_ACCOUNT_JSON - authenticate with a GCP service account. To use this, make sure you add your service account JSON blob as a Truss secret.
You specify this in the docker_auth settings like so:
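A sketch; the image path and secret name are illustrative:

```yaml
base_image:
  image: us-east4-docker.pkg.dev/my-project/my-repo/my-image:latest
  docker_auth:
    auth_method: GCP_SERVICE_ACCOUNT_JSON
    secret_name: gcp-service-account-json
    registry: us-east4-docker.pkg.dev
```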
Note that here, secret_name references the secret that you added your service account JSON to.
base_image.docker_auth.secret_name
The Truss secret that stores the credential that you will be authenticating with.
Ensure that this secret is added to the secrets section of your Truss, and that you deploy your model with the --trusted flag.
base_image.docker_auth.registry
The registry to authenticate to (e.g., us-east4-docker.pkg.dev).
runtime
Runtime settings for your model instance.
runtime.predict_concurrency
(default: 1)
This field governs how much concurrency can run in the predict method of your model. This is useful if you have a model that supports parallelism and you'd like to take advantage of that.
By default, this value is set to 1, meaning that predict can only run for one request at a time. This protects the GPU from being over-utilized and is a good default for many models.
See How to configure concurrency for more detail on how to set this value.
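For example, to allow up to four concurrent calls to predict:

```yaml
runtime:
  predict_concurrency: 4
```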
runtime.enable_tracing_data
(default: False)
Enables trace data export with built-in OTEL instrumentation. If not further specified, this data is only collected internally by Baseten and can help us troubleshoot. You can additionally export it to your own systems; refer to the tracing guide. Note that turning this on could add performance overhead.
runtime.enable_debug_logs
(default: False)
If turned on, the log level for the Truss server is changed from INFO to DEBUG.
external_data
Use external_data if you have data that you want to be bundled in your image at build time. This is useful if you have a large amount of data that you want to be available to your model. By including it at build time, you reduce the cold-start time of your instance, as the data is already available in the image. You can use it like so:
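A sketch; the URL and paths are illustrative:

```yaml
external_data:
  - url: https://example.com/my-model-weights.tar.gz
    local_data_path: data/weights.tar.gz
    name: model-weights
```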
external_data.<list_item>.url
The URL to download data from.
external_data.<list_item>.local_data_path
The path on the image where the data will be downloaded to.
external_data.<list_item>.name
You can set a name for the data, which is useful for readability purposes. This field is not required.
build_commands
A list of commands to run at build time. Useful for performing one-off bash commands. For instance, if you wanted to git clone a repository, you could do so by specifying that command:
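For example (repository URL illustrative):

```yaml
build_commands:
  - git clone https://github.com/example-org/example-repo.git
```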
build
The build
section is used to define options for builds.
build.secret_to_path_mapping
This option grants access to secrets during the build. Here, you provide a mapping between a secret and a path on the image.
You can then access the secret in commands specified in the build_commands section by running cat on the file. For instance, to install a pip package from a private GitHub repository, you could do the following:
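A sketch, using the secret name from the explanation below and a hypothetical private repository:

```yaml
secrets:
  my-github-access-token: null
build:
  secret_to_path_mapping:
    my-github-access-token: /root/my-github-access-token
build_commands:
  - pip install git+https://$(cat /root/my-github-access-token)@github.com/my-org/my-private-repo.git
```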
Here, we specify that the Truss has access to a secret called my-github-access-token, and that it can be accessed at the path /root/my-github-access-token. In our build_commands section, we then access that secret by using cat to read its contents.
Under the hood, this option mounts your secret as a build secret. This means that the value of your secret will be secure and will not be exposed via Docker history or logs.
This requires setting the --trusted flag when deploying your model, and having a secret named my-github-access-token stored on Baseten.
model_cache
The model_cache section is used for caching model weights at build time. This is one of the biggest levers for decreasing cold start times, as downloading weights can be one of the lengthiest parts of starting a new model instance.
See the model cache guide for the full details on how to use this field.
Despite the fact that this field is called model_cache, multiple backends are supported, not just Hugging Face. You can also cache weights stored on GCS, for instance.
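A sketch, using the Hugging Face repo from the example below (patterns illustrative):

```yaml
model_cache:
  - repo_id: madebyollin/sdxl-vae-fp16-fix
    revision: main
    allow_patterns:
      - "*.json"
      - "*.safetensors"
```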
model_cache.<list_item>.repo_id
The repository or bucket to cache from. Currently, we support Hugging Face and Google Cloud Storage. Example: madebyollin/sdxl-vae-fp16-fix for a Hugging Face repo, or gcs://path-to-my-bucket for a GCS bucket.
model_cache.<list_item>.revision
Points to your revision. By default, it refers to main.
model_cache.<list_item>.allow_patterns
Only cache files that match specified patterns. Utilize Unix shell-style wildcards to denote these patterns. By default, all paths are included.
model_cache.<list_item>.ignore_patterns
Conversely, you can denote file patterns to ignore, streamlining the caching process. By default, nothing is ignored.