Create a model via the REST API

The management API exposes the same archive-based push that truss push runs under the hood. Use it from non-Python clients (such as Go or JavaScript) or from CI environments where you can’t run the Python Truss CLI. For most workflows, truss push is simpler. The flow has three steps:

1. POST /v1/prepare_model_upload validates the payload and returns temporary credentials scoped to an S3 location.
2. Upload your Truss archive to that location.
3. POST /v1/models commits the upload as a new model. To add a deployment to an existing model, call POST /v1/models/{model_id}/deployments instead.

Prepare the upload

Send the parsed Truss config and a model name. To load weights through the Baseten Delivery Network, include a weights block in the config (source accepts hf://, s3://, gs://, and bdn:// URIs). Set dry_run to true to validate the payload without issuing credentials. The following request validates the payload and issues credentials:

curl https://api.baseten.co/v1/prepare_model_upload \
  -H "Authorization: Bearer $BASETEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-model",
    "deployment": {
      "config": { "model_name": "my-model", "resources": { "accelerator": "A10G" }, "weights": [{ "source": "hf://meta-llama/Llama-3.1-8B@main", "mount_location": "/models/llama" }] }
    }
  }'

The response carries the upload credentials and the S3 location to upload to:

{
  "creds": {
    "aws_access_key_id": "ASIA...",
    "aws_secret_access_key": "...",
    "aws_session_token": "..."
  },
  "s3_bucket": "baseten-user-models-xxxx",
  "s3_key": "organizations/.../models/.../model.tgz",
  "s3_region": "us-west-2"
}

To add a deployment to an existing model instead of creating a new one, send model_id rather than name. Exactly one of the two is required.
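For clients that build the request body in code, the "exactly one of name or model_id" rule can be enforced before the request is sent. The following is a minimal sketch, not part of any Baseten SDK: build_prepare_payload is a hypothetical helper, and it assumes that dry_run and model_id are top-level fields alongside deployment, which the text above implies but does not show.

```python
from typing import Any, Optional


def build_prepare_payload(
    config: dict[str, Any],
    name: Optional[str] = None,
    model_id: Optional[str] = None,
    dry_run: bool = False,
) -> dict[str, Any]:
    """Build a JSON body for POST /v1/prepare_model_upload (hypothetical helper)."""
    # Exactly one of name / model_id may be set, per the rule above.
    if (name is None) == (model_id is None):
        raise ValueError("provide exactly one of name or model_id")

    payload: dict[str, Any] = {"deployment": {"config": config}}
    if name is not None:
        payload["name"] = name  # create a new model
    else:
        payload["model_id"] = model_id  # add a deployment to an existing model
    if dry_run:
        # Assumption: dry_run sits at the top level of the request body.
        payload["dry_run"] = True
    return payload
```

Serialize the returned dictionary with json.dumps and send it with the same headers as the curl example.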

Upload the archive

Package your Truss as a gzipped tar archive, then upload it to the returned s3_bucket and s3_key using the temporary credentials:

import boto3

# resp is the JSON returned by the prepare step
creds = resp["creds"]
session = boto3.Session(
    aws_access_key_id=creds["aws_access_key_id"],
    aws_secret_access_key=creds["aws_secret_access_key"],
    aws_session_token=creds["aws_session_token"],
    region_name=resp["s3_region"],
)
session.client("s3").upload_file("model.tgz", resp["s3_bucket"], resp["s3_key"])
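The upload call above expects model.tgz to exist already. One way to produce it with the Python standard library is sketched below. package_truss is a hypothetical helper, and the archive layout (the contents of the Truss directory at the archive root, so that config.yaml sits at the top level) is an assumption rather than something this page specifies.

```python
import tarfile
from pathlib import Path


def package_truss(truss_dir: str, archive_path: str = "model.tgz") -> str:
    """Write the contents of truss_dir into a gzipped tar archive."""
    root = Path(truss_dir)
    if not root.is_dir():
        raise NotADirectoryError(truss_dir)

    with tarfile.open(archive_path, "w:gz") as tar:
        # Sort for a stable entry order; add each entry individually so the
        # archive paths are relative to the Truss directory (assumed layout).
        for path in sorted(root.rglob("*")):
            arcname = path.relative_to(root).as_posix()
            tar.add(path, arcname=arcname, recursive=False)
    return archive_path
```

Write the archive outside the Truss directory so that it does not end up packaged inside itself.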

Create the model

Commit the upload with source.kind set to model_archive, the same deployment payload you validated, and the s3_key from the prepare step:

curl https://api.baseten.co/v1/models \
  -H "Authorization: Bearer $BASETEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source": {
      "kind": "model_archive",
      "name": "my-model",
      "s3_key": "organizations/.../models/.../model.tgz",
      "deployment": {
        "config": { "model_name": "my-model", "resources": { "accelerator": "A10G" }, "weights": [{ "source": "hf://meta-llama/Llama-3.1-8B@main", "mount_location": "/models/llama" }] }
      }
    }
  }'

The response returns the created model and its first deployment:

{
  "model": { "id": "abcd123", "name": "my-model" },
  "deployment": { "id": "1q2w3e4", "status": "BUILDING" }
}
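Because the commit request repeats the deployment payload from the prepare step, deriving the commit body from the prepare request and response keeps the two in sync. The sketch below shows one way to do that. build_create_payload is a hypothetical helper, and it covers only the new-model case, since the body for POST /v1/models/{model_id}/deployments is not described here.

```python
from typing import Any


def build_create_payload(
    prepare_payload: dict[str, Any],
    prepare_response: dict[str, Any],
) -> dict[str, Any]:
    """Build a JSON body for POST /v1/models from the prepare step's data."""
    if "name" not in prepare_payload:
        # Adding a deployment to an existing model uses a different endpoint.
        raise ValueError("prepare payload has no name; use the deployments endpoint")

    return {
        "source": {
            "kind": "model_archive",
            "name": prepare_payload["name"],
            # Reuse the key returned by the prepare step, unchanged.
            "s3_key": prepare_response["s3_key"],
            # Reuse the deployment payload that was validated earlier.
            "deployment": prepare_payload["deployment"],
        }
    }
```

After the request succeeds, the model and deployment IDs needed for polling are available as model.id and deployment.id in the response shown above.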

Wait for the deployment

The deployment isn’t ready when the call returns. Poll GET /v1/models/{model_id}/deployments/{deployment_id} until its status is ACTIVE.
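A polling loop for this step might look like the sketch below, which uses only the Python standard library. fetch_deployment and wait_until_active are hypothetical helpers. The sketch assumes that the deployment response exposes a top-level status field, and the interval and timeout defaults are arbitrary placeholders.

```python
import json
import time
import urllib.request
from typing import Any, Callable

API_BASE = "https://api.baseten.co/v1"


def fetch_deployment(model_id: str, deployment_id: str, api_key: str) -> dict[str, Any]:
    """GET /v1/models/{model_id}/deployments/{deployment_id}."""
    url = f"{API_BASE}/models/{model_id}/deployments/{deployment_id}"
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_until_active(
    fetch: Callable[[], dict[str, Any]],
    interval_s: float = 10.0,
    timeout_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call fetch() repeatedly until the deployment reports ACTIVE."""
    deadline = time.monotonic() + timeout_s
    while True:
        deployment = fetch()
        # Assumption: status is a top-level field of the response.
        if deployment.get("status") == "ACTIVE":
            return deployment
        if time.monotonic() >= deadline:
            raise TimeoutError(f"deployment not ACTIVE after {timeout_s} seconds")
        sleep(interval_s)
```

A call such as wait_until_active(lambda: fetch_deployment(model_id, deployment_id, api_key)) blocks until the deployment is ACTIVE or the timeout elapses. The loop does not detect failed builds, because this page does not list the failure statuses; a production client should also stop on those.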

Next steps

Prepare a model upload

Create a model from a source