Engine Builder Models
Engine Builder models are pre-trained models that are optimized for specific inference tasks.
Baseten’s Engine Builder enables the deployment of optimized model inference engines. Currently, it supports TensorRT-LLM. Truss Chains allows seamless integration of these engines into structured workflows. This guide provides a quick entry point for Chains users.
LLama 7B Example
Use the EngineBuilderLLMChainlet
baseclass to configure an LLM engine. The additional engine_builder_config
field specifies model architecture, repository, and runtime parameters and more, the full options are detailed in the Engine Builder configuration guide.
Differences from Standard Chainlets
- No
run_remote
implementation: Unlike regular Chainlets,EngineBuilderLLMChainlet
does not require users to implementrun_remote()
. Instead, it automatically wires into the deployed engine’s API. All LLM Chainlets have the same function signature:chains.EngineBuilderLLMInput
as input and a stream (AsyncIterator
) of strings as output. LikewiseEngineBuilderLLMChainlet
s can only be used as dependencies, but not have dependencies themselves. - No
run_local
(guide) andwatch
(guide) Standard Chains support a local debugging mode and watch. However, when usingEngineBuilderLLMChainlet
, local execution is not available, and testing must be done after deployment. For a faster dev loop of the rest of your chain (everything except the engine builder chainlet) you can substitute those chainlets with stubs like you can do for an already deployed truss model [guide].
Integrate the Engine Builder Chainlet
After defining an EngineBuilderLLMInput
like Llama7BChainlet
above, you can use it as a dependency in other conventional chainlets:
Was this page helpful?