Baseten’s model library is the fastest way to get started with dozens of popular foundation models like Mistral 7B, Stable Diffusion XL and Whisper.

Deploy a library model

You can deploy any model from the model library in just a couple of clicks.

Models deployed from the web UI are published and trusted by default, so they have full access to autoscaling options and workspace secrets.

Model library inference guide

Models in the model library follow a set of standards for consistency and predictable performance.

Configuration standards

  • Each model uses the least expensive instance type that can run it reliably. For example, Stable Diffusion XL uses an A10-based instance even though an A100-based instance is faster. You can adjust the instance type in the model dashboard after deployment.
  • Models like Llama 2 that rely on Hugging Face authentication require the secret hf_access_token to be set in your workspace secrets.

Input standards

  • All models take a dictionary as input.
  • For models like LLMs and Stable Diffusion that take a text input, the key for that input is prompt.
  • Transformers-based models like LLMs support arguments from the transformers GenerationConfig object, as shown in the example below.
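
Here is a minimal sketch of calling a library LLM with the standard prompt key. The model ID is a placeholder, the API key is assumed to be exported as BASETEN_API_KEY, and the endpoint path follows Baseten's standard production inference URL; max_new_tokens and temperature are examples of GenerationConfig-style arguments, not required parameters.

```python
import os
import requests

model_id = "abcd1234"  # placeholder: replace with your deployed model's ID

# The input is a dictionary; text-input models expect the "prompt" key.
resp = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {os.environ['BASETEN_API_KEY']}"},
    json={
        "prompt": "What is the model library?",
        # GenerationConfig-style arguments for transformers-based models
        "max_new_tokens": 256,
        "temperature": 0.7,
    },
)
print(resp.json())
```

Other model types follow the same pattern; only the keys in the input dictionary change.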

Output standards

  • Any generated image or audio is returned as a base64-encoded string.
  • All LLMs support streaming. To stream a response, pass "stream": true in the input dictionary. The example below shows both conventions.
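
The sketch below illustrates both output conventions. The model IDs, the image model's response key ("data" here), and the output filename are placeholder assumptions; check each model's page for its exact response shape.

```python
import base64
import os
import requests

headers = {"Authorization": f"Api-Key {os.environ['BASETEN_API_KEY']}"}

# 1. Decode a base64-encoded image from an image model such as Stable Diffusion XL.
image_resp = requests.post(
    "https://model-IMAGE_MODEL_ID.api.baseten.co/production/predict",
    headers=headers,
    json={"prompt": "a watercolor painting of a lighthouse"},
)
b64_image = image_resp.json()["data"]  # assumed key; varies by model
with open("lighthouse.png", "wb") as f:
    f.write(base64.b64decode(b64_image))

# 2. Stream tokens from an LLM by passing "stream": true and reading the
#    response incrementally.
llm_resp = requests.post(
    "https://model-LLM_MODEL_ID.api.baseten.co/production/predict",
    headers=headers,
    json={"prompt": "Write a haiku about GPUs", "stream": True},
    stream=True,
)
for chunk in llm_resp.iter_content(chunk_size=None):
    print(chunk.decode("utf-8"), end="", flush=True)
```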