Training checkpoint deployment
Deploy fine-tuned models from Baseten Training with Engine-Builder-LLM. Specify BASETEN_TRAINING as the source:
config.yaml
- base_model: decoder for LLMs, encoder/encoder_bert for embeddings
- source: BASETEN_TRAINING for Baseten Training checkpoints
- repo: Your training job ID
- revision: Checkpoint folder name (e.g., checkpoint-100, checkpoint-final)
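As a rough illustration of how the four fields above fit together, the following sketch assembles them into a dictionary and rejects obviously invalid values. The helper name training_checkpoint_fields and the placeholder job ID are hypothetical. Where these keys are nested inside config.yaml is not shown in this section, so the flat layout here is an assumption made only for illustration.

```python
# Values of base_model named in this section.
ALLOWED_BASE_MODELS = {"decoder", "encoder", "encoder_bert"}


def training_checkpoint_fields(base_model: str, job_id: str, checkpoint: str) -> dict:
    """Collect the fields that point a build at a Baseten Training checkpoint.

    Hypothetical helper: the flat dict is for illustration and does not
    claim to mirror the exact nesting used in config.yaml.
    """
    if base_model not in ALLOWED_BASE_MODELS:
        raise ValueError(f"unsupported base_model: {base_model!r}")
    if not job_id or not checkpoint:
        raise ValueError("job_id and checkpoint must be non-empty")
    return {
        "base_model": base_model,
        "source": "BASETEN_TRAINING",  # fixed for training checkpoints
        "repo": job_id,                # the training job ID
        "revision": checkpoint,        # checkpoint folder name
    }


# "my-job-id" is a placeholder, not a real training job ID.
fields = training_checkpoint_fields("decoder", "my-job-id", "checkpoint-100")
```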
Encoder model requirements
To deploy a fine-tuned encoder model (embeddings, rerankers) from training checkpoints, use encoder or encoder_bert as the base model:
config.yaml
Use encoder_bert for BERT-based models (sentence-transformers, classification, reranking). Use encoder for causal embedding models.
Encoder models have specific requirements:
- No tensor parallelism: Omit tensor_parallel_count or set it to 1.
- Fast tokenizer required: Your checkpoint must include a tokenizer.json file. Models using only the legacy vocab.txt format are not supported.
- Embedding model files: For sentence-transformer models, include modules.json and 1_Pooling/config.json in your checkpoint.
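Before deploying, it can help to verify a local copy of the checkpoint against the requirements above. The following is a minimal sketch, assuming the checkpoint folder is available on disk. The function name check_encoder_checkpoint and its parameters are hypothetical and not part of any Baseten tooling.

```python
from pathlib import Path
from typing import List, Optional


def check_encoder_checkpoint(
    checkpoint_dir: str,
    sentence_transformer: bool = False,
    tensor_parallel_count: Optional[int] = None,
) -> List[str]:
    """Return a list of problems; an empty list means every check passed."""
    root = Path(checkpoint_dir)
    problems: List[str] = []

    # Tensor parallelism must be omitted (None here) or set to 1.
    if tensor_parallel_count not in (None, 1):
        problems.append("tensor_parallel_count must be omitted or set to 1")

    # A fast tokenizer is required; vocab.txt alone is not supported.
    if not (root / "tokenizer.json").is_file():
        problems.append("missing tokenizer.json")

    # Sentence-transformer models also need their module and pooling configs.
    if sentence_transformer:
        for rel in ("modules.json", "1_Pooling/config.json"):
            if not (root / rel).is_file():
                problems.append(f"missing {rel}")

    return problems
```

Calling check_encoder_checkpoint("path/to/checkpoint-100", sentence_transformer=True) returns a list of human-readable issues, so it can run as a pre-deployment step and fail the step when the list is non-empty.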
webserver_default_route configures the inference endpoint. Options include /v1/embeddings for embeddings, /rerank for rerankers, and /predict for classification.
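The mapping from model type to route can be captured in a small lookup, sketched below. The task labels ("embeddings", "reranker", "classification") and the helper name default_route_for are hypothetical; only the three route strings come from this section.

```python
# Routes listed above, keyed by illustrative (hypothetical) task labels.
DEFAULT_ROUTES = {
    "embeddings": "/v1/embeddings",
    "reranker": "/rerank",
    "classification": "/predict",
}


def default_route_for(task: str) -> str:
    """Return the webserver_default_route value for a given task label."""
    try:
        return DEFAULT_ROUTES[task]
    except KeyError:
        known = ", ".join(sorted(DEFAULT_ROUTES))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None
```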
Cloud storage deployment
Deploy models directly from S3, GCS, or Azure. Specify the storage source and bucket path:
config.yaml
- S3: Amazon S3 buckets
- GCS: Google Cloud Storage
- AZURE: Azure Blob Storage
- HF: Hugging Face repositories
Private storage setup
All runtimes use the same downloader system as model_cache. As a result, you configure runtime_secret_name and repo the same way for model_cache as for runtimes like Engine-Builder-LLM or BEI.
Secret Setup:
Add these JSON secrets to your Baseten secrets manager.
For more details, refer to the documentation in model_cache.
S3:
Further reading
- Engine-Builder-LLM configuration: Complete build and runtime options for LLMs.
- BEI reference configuration: Complete configuration for encoder models.
- Model cache documentation: Caching strategies used by the engines.
- Secrets management: Configure credentials for private storage.