FAQs
Where are the engines stored?
The engines are stored in Baseten but owned by the user. We're working on a mechanism for downloading them. In the meantime, reach out if you need access to an engine that you created using the Engine Builder.
Does the Engine Builder support quantization?
Yes. The Engine Builder can perform post-training quantization during the building process. For supported options, see quantization in the config reference.
Can I customize the engine behavior?
For further control over the TensorRT-LLM engine during inference, use the model/model.py file to access the engine object at runtime. See controlling engines with Python for details.
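As a rough illustration of that pattern, the sketch below shows a model class that receives an engine object and wraps its prediction call with custom logic. It is not the documented interface. The `trt_llm` constructor argument, the `"engine"` key, the async `predict` method, and the `max_tokens` field are all assumptions made for this example, and `StubEngine` is a hypothetical stand-in so the code runs without a built engine. Check the controlling engines with Python guide for the actual names.

```python
import asyncio


class StubEngine:
    """Hypothetical stand-in for the engine object; not the real engine API."""

    async def predict(self, model_input: dict) -> dict:
        # Echo the request back so the effect of the wrapper is visible.
        return {
            "echo": model_input["prompt"],
            "max_tokens": model_input["max_tokens"],
        }


class Model:
    def __init__(self, trt_llm: dict, **kwargs):
        # Assumption: the engine is handed to the model under an "engine" key.
        self._engine = trt_llm["engine"]

    async def predict(self, model_input: dict) -> dict:
        # Custom behavior before the engine call: cap the generation length.
        request = dict(model_input)
        request["max_tokens"] = min(request.get("max_tokens", 256), 512)
        return await self._engine.predict(request)


def run_demo() -> dict:
    # Wire the stub engine into the model and send one request through it.
    model = Model(trt_llm={"engine": StubEngine()})
    return asyncio.run(model.predict({"prompt": "hello", "max_tokens": 4096}))
```

Calling `run_demo()` sends one request through the wrapper. The same hook could be used for input validation, prompt templating, or post-processing of the engine output.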