These examples cover a variety of use cases on Baseten, from deploying your first LLM or image generation model to transcription, embeddings, and RAG pipelines. Whether you’re optimizing inference with TensorRT-LLM or deploying a model with Truss, these guides help you build and scale efficiently.

Choosing the right engine

Not sure which engine to use? Check out our engine documentation to:
  • Select the appropriate engine for your model architecture (embeddings, dense LLMs, or MoE models)
  • Understand performance trade-offs between different engine options
  • Configure advanced features like quantization and speculative decoding
  • Optimize for your specific use case with engine-specific guidance
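As a concrete illustration of the kind of engine-level tuning the docs cover, here is a hypothetical Truss `config.yaml` fragment enabling quantization and speculative decoding for a TensorRT-LLM deployment. The field names and values below are illustrative assumptions, not a verified schema; consult the engine documentation for the options your engine actually supports.

```yaml
# Hypothetical config.yaml fragment (illustrative only).
trt_llm:
  build:
    base_model: llama          # dense LLM architecture
    quantization_type: fp8     # trade a little accuracy for throughput and memory
    speculator:                # speculative decoding with a smaller draft model
      speculative_model: my-draft-model  # hypothetical draft-model name
```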

Model library

For a quick start, explore the model library of prebuilt models like DeepSeek, Llama, and Qwen that you can deploy in one click.
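Once a model is deployed, you call it over HTTP. The sketch below assembles a predict request in Python; the endpoint URL pattern and the `prompt` payload key are assumptions for illustration, so check your model's dashboard for the exact invocation format.

```python
def build_request(model_id: str, api_key: str, prompt: str):
    """Assemble URL, headers, and JSON body for a Baseten predict call.

    The URL pattern and payload keys here are illustrative assumptions;
    your model's dashboard shows the exact invocation details.
    """
    url = f"https://model-{model_id}.api.baseten.co/production/predict"
    headers = {"Authorization": f"Api-Key {api_key}"}
    body = {"prompt": prompt}
    return url, headers, body


# To actually send the request (requires the `requests` package):
# import requests
# url, headers, body = build_request("abc123", "YOUR_API_KEY", "Hello!")
# response = requests.post(url, headers=headers, json=body)
# print(response.json())
```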

Training

Train and fine-tune models with Baseten’s scalable training infrastructure. From fine-tuning large language models to training custom models, our platform provides the tools and compute you need. Our training infrastructure supports popular frameworks including VERL, Megatron, and Unsloth, as well as models trained directly with Hugging Face Transformers.