Welcome to Baseten!

Bring your models. We'll handle the rest.
Baseten is a machine learning infrastructure platform for serving models of any size and modality performantly, scalably, and affordably in production.
All you need to get started deploying models with Baseten is an API key.
  • Sign in to or sign up for your Baseten account.
  • Generate an API key.
  • In your terminal, run pip install --upgrade baseten.
  • Authenticate with baseten login.
To kick off your exploration, every new account comes with free model resource credits.

Deploying models

You can deploy any model on Baseten:
  • If you want to deploy a popular foundation model as-is, it's just two clicks away in the model library.
  • For full control over the deployment and serving of an off-the-shelf model, clone a Truss of a foundation model from Baseten's GitHub.
  • To deploy your own model, package it with Truss, an open-source model packaging library developed by Baseten.
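A Truss wraps a model behind a small Python interface. Here is a minimal sketch of the model.py inside a Truss, with a placeholder echo model standing in for real weights and inference logic:

```python
# Minimal sketch of a Truss model.py. The echo "model" is a placeholder;
# a real Truss loads weights in load() and runs inference in predict().
class Model:
    def __init__(self, **kwargs):
        self._model = None

    def load(self):
        # Called once at startup: load weights from disk here.
        self._model = lambda x: x  # placeholder model

    def predict(self, model_input):
        # Called per request with the deserialized request body.
        return {"predictions": self._model(model_input)}
```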
To deploy your model, load it with Truss and call baseten.deploy():
import baseten
import truss

my_model = truss.load("~/path/to/my-model")
baseten.deploy(my_model, model_name="My wonderful model")

Managing models

Once you deploy a model, you can view it on your Baseten dashboard and manage it from its model page.

Building with models

Deployed models can be invoked via API call:
curl -X POST https://app.baseten.co/models/MODEL_ID/predict \
-H 'Authorization: Api-Key YOUR_API_KEY' \
-d '{"inputs": []}'
Replace MODEL_ID with your model's ID from its model page, and the request body with your model's input.
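The same request can be issued from Python. Below is a sketch using only the standard library; the endpoint path, model ID, and payload shape are placeholder assumptions to substitute with your own model's details:

```python
# Sketch: build the HTTP request for a deployed model's predict endpoint.
# The model ID, API key, and payload are placeholders, not real values.
import json
from urllib import request

def build_predict_request(model_id: str, api_key: str, payload: dict) -> request.Request:
    """Construct a POST request to a model's predict endpoint."""
    return request.Request(
        url=f"https://app.baseten.co/models/{model_id}/predict",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_predict_request("MODEL_ID", "YOUR_API_KEY", {"inputs": []})
# Sending it is one call away:
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```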
But there's so much more you can do with your models on Baseten. You can build microservices via worklets or entire applications to test and use your models. And Baseten's hosted Postgres tables and data connections make it seamless to integrate your data with your model.


Use your Baseten account and workspace settings to manage your workspace.


Security

Securing users' data and the infrastructure that runs their code and models is paramount. Each user's workload is isolated in a separate Kubernetes namespace with strict network security policies on inter-namespace communication, and with Pod security policies enforced and monitored by Gatekeeper and Sysdig.
Baseten strongly encourages the best practice of keeping sensitive data away from code by providing multiple ways to store secrets securely.
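For example, a packaged model can read a secret at runtime rather than hard-coding it. Here is a sketch assuming a secret named openai_api_key has been stored in your workspace and declared in the Truss config:

```python
# Sketch: a Truss model reading a secret at runtime. Assumes a secret
# named "openai_api_key" is stored in the workspace and declared in the
# Truss config; Truss passes declared secrets into the constructor.
class Model:
    def __init__(self, **kwargs):
        self._secrets = kwargs["secrets"]

    def predict(self, model_input):
        api_key = self._secrets["openai_api_key"]  # never stored in code
        # Use api_key with your client library here; placeholder response:
        return {"used_secret": bool(api_key), "input": model_input}
```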

Network accelerator

We developed a network accelerator to speed up model loads from common model artifact stores, including Hugging Face, CloudFront, S3, and OpenAI. The accelerator uses byte-range downloads in the background to maximize download parallelism. To disable network acceleration for your Baseten workspace, contact our support team at [email protected].
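To illustrate the idea, here is a sketch of byte-range parallel downloading. The chunking scheme, worker count, and fetch_range callable are assumptions for illustration, not Baseten's implementation:

```python
# Sketch of byte-range parallel downloading: split a file into ranges,
# fetch them concurrently, and stitch the chunks back together in order.
from concurrent.futures import ThreadPoolExecutor

def split_ranges(total_size: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split [0, total_size) into inclusive (start, end) byte ranges."""
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]

def parallel_download(fetch_range, total_size: int, chunk_size: int, workers: int = 8) -> bytes:
    """Fetch every chunk concurrently and join the results in order.

    fetch_range(start, end) should issue an HTTP GET with the header
    "Range: bytes={start}-{end}" and return the body bytes.
    """
    ranges = split_ranges(total_size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = pool.map(lambda r: fetch_range(*r), ranges)
        return b"".join(chunks)
```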


Support

Everyone is invited to email our main support channel — [email protected] — or send us a message by clicking the Support button within the product. These channels are monitored during Pacific Time business hours by senior engineers, Baseten founders, and other in-house product experts.
Workspaces on the Enterprise plan get access to a dedicated forward-deployed engineer in a shared Slack channel, where questions are answered within 4 business hours (Pacific Time) and most within minutes.