model/model.py file. The simplest directory structure is:
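As a rough sketch, the snippet below builds a bare layout in a temporary directory and prints it. It assumes that config.yaml sits at the top level next to the model/ directory; the helper name create_skeleton is hypothetical and exists only for illustration.

```python
import tempfile
from pathlib import Path


def create_skeleton(root: Path) -> list[str]:
    """Create an empty layout under root and return its relative paths, sorted."""
    (root / "model").mkdir(parents=True, exist_ok=True)
    (root / "config.yaml").touch()          # assumed location of the config file
    (root / "model" / "model.py").touch()   # the file that holds the Model class
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


with tempfile.TemporaryDirectory() as tmp:
    layout = create_skeleton(Path(tmp))
    print("\n".join(layout))
```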
The model.py file must contain a class with these methods:
- __init__ initializes the Model class. Read configuration parameters and other information here.
- load initializes the model. Download model weights or load them onto a GPU here.
- predict runs inference.
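A minimal sketch of a class with these three methods might look like the following. The "model" here is a plain Python function standing in for real weights, and the input and output keys ("text", "output") are illustrative assumptions.

```python
class Model:
    def __init__(self, **kwargs):
        # Read configuration here; no heavy work happens yet.
        self._model = None

    def load(self):
        # Stand-in for downloading weights or moving them onto a GPU.
        self._model = lambda text: text.upper()

    def predict(self, model_input):
        # Run "inference" with whatever load() prepared.
        return {"output": self._model(model_input["text"])}


model = Model()
model.load()
result = model.predict({"text": "hello"})
print(result)
```

Calling load before predict mirrors the order described above: the class is constructed, the model is initialized, and only then does inference run.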
__init__
The __init__ method initializes the Model class. Use it to read configuration parameters and runtime information.
The simplest signature for __init__ is:
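A minimal sketch, assuming the constructor accepts keyword arguments that it is free to ignore:

```python
class Model:
    def __init__(self, **kwargs):
        # Nothing is read from kwargs yet; just reserve a slot for the model
        # that load() will fill in later.
        self._model = None


model = Model()
```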
__init__ can also receive the following values:
- config: A dictionary containing the model's config.yaml.
- data_dir: A string containing the path to the model's data directory.
- secrets: A dictionary containing the model's secrets. At runtime, these are populated with the actual values stored on Baseten.
- environment: A string containing the model's environment, if the model has been deployed to one.
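One way to capture these values is sketched below. It assumes they arrive as keyword arguments under the names listed above; the example values passed to the constructor are hypothetical placeholders.

```python
class Model:
    def __init__(self, **kwargs):
        # Assumption: each value is passed as a keyword argument with
        # the same name as in the list above.
        self._config = kwargs.get("config", {})
        self._data_dir = kwargs.get("data_dir")
        self._secrets = kwargs.get("secrets", {})
        self._environment = kwargs.get("environment")
        self._model = None


# Hypothetical values, for illustration only.
model = Model(
    config={"model_name": "demo"},
    data_dir="/tmp/data",
    secrets={"example_token": "placeholder"},
    environment="staging",
)
print(model._config["model_name"], model._environment)
```

Using kwargs.get with defaults keeps the constructor usable in local tests where some of these values are not supplied.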
load
The load method is where you define the logic for initializing the model. As
mentioned before, this might include downloading model weights or loading them
onto the GPU.
Unlike the other methods, load does not accept any parameters:
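A minimal sketch of load, with a simple averaging function standing in for real model weights (an assumption made so the example runs without any ML library):

```python
class Model:
    def __init__(self, **kwargs):
        self._model = None

    def load(self):
        # Takes no parameters besides self. A real implementation would
        # download weights or move the model onto a GPU here.
        self._model = lambda values: sum(values) / len(values)


model = Model()
loaded_before = model._model is not None
model.load()
loaded_after = model._model is not None
print(loaded_before, loaded_after)
```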
The deployment is not considered ready until load has
completed successfully. Note that load has a timeout of 30 minutes; if it hasn't
completed by then, the deployment is marked as failed.
predict
The predict method is where you define the logic for performing inference.
The simplest signature for predict is:
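A minimal sketch, assuming predict takes a single input argument and returns a dictionary. The "text" and "length" keys are illustrative, not part of any required schema.

```python
class Model:
    def predict(self, model_input):
        # Toy inference: report the length of the input text.
        return {"length": len(model_input["text"])}


result = Model().predict({"text": "hello"})
print(result)
```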
The return value of predict must be JSON-serializable, so it can be a:
- dict
- list
- str
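The sketch below checks each of these return types with the standard library's json.dumps. The "kind" switch in the input is a hypothetical device used only to exercise all three cases from one method.

```python
import json


class Model:
    def predict(self, model_input):
        kind = model_input["kind"]
        if kind == "dict":
            return {"label": "positive"}
        if kind == "list":
            return ["positive", "negative"]
        return "positive"  # plain string


model = Model()
# json.dumps raises TypeError for values that are not JSON-serializable.
encoded = [json.dumps(model.predict({"kind": k})) for k in ("dict", "list", "str")]
print(encoded)
```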
For a more strictly typed return value, predict can return a Pydantic object.
An instance of that Pydantic model can then be returned from predict.
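A minimal sketch of a typed return value, assuming the pydantic package is installed. The Result class and its value field are hypothetical names chosen for illustration.

```python
from pydantic import BaseModel


class Result(BaseModel):
    # Hypothetical response schema with a single string field.
    value: str


class Model:
    def predict(self, model_input) -> Result:
        # Return a validated object instead of a bare dict.
        return Result(value=model_input["text"].upper())


result = Model().predict({"text": "hello"})
print(result.value)
```

Declaring the schema this way lets Pydantic reject a response whose fields have the wrong type before it leaves predict.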
Streaming
In addition to supporting a single request/response cycle, Truss also supports streaming. See the Streaming guide for more information.
Async vs. Sync
Note that the predict method is synchronous by default. However, if your model inference
depends on APIs that require asyncio, predict can also be written as a coroutine.
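A minimal sketch of predict written as a coroutine. The fetch_score helper is a hypothetical stand-in for an asyncio-based API call, and asyncio.run is used here only to drive the coroutine locally.

```python
import asyncio


async def fetch_score(text: str) -> float:
    # Hypothetical async dependency; a real one would await network I/O.
    await asyncio.sleep(0)
    return float(len(text))


class Model:
    async def predict(self, model_input):
        # Awaiting inside predict is possible because it is a coroutine.
        score = await fetch_score(model_input["text"])
        return {"score": score}


result = asyncio.run(Model().predict({"text": "hello"}))
print(result)
```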