Use the Loops Python SDK to create a LoRA training run, save a checkpoint, and list that checkpoint from Python and HTTP. The base model throughout is Qwen/Qwen3.5-2B, one of the supported base models.

Prerequisites

  • Baseten account: Sign up for Baseten if you don’t have one.
  • Baseten workspace with Loops enabled: Loops is in early access. Fill out the signup form to request access for your workspace.
  • API key with org access to Loops: Generate a workspace API key and export it:
    export BASETEN_API_KEY="paste-your-api-key-here"
    
  • Python 3.13+ and uv: The quickstart uses uv to install the Loops client and run the training script.
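Before provisioning anything, it can help to confirm that the key from the prerequisites is actually exported in the shell that will run the script. The following is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper written for this page, not part of the Loops SDK.

```python
import os


def require_api_key(env=os.environ, name="BASETEN_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; the SDK performs its own auth.
    """
    key = env.get(name, "").strip()
    # Treat the placeholder from the export example as "not set".
    if not key or key == "paste-your-api-key-here":
        raise RuntimeError(f"{name} is not set; export it before running the quickstart")
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print("BASETEN_API_KEY is set")
    except RuntimeError as err:
        print(err)
```

The helper never prints the key itself, so it is safe to run in shared terminals or CI logs.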

Install

The main client package is baseten-loops on PyPI. The Tinker compatibility shim ships as the [tinker] extra (distributed as baseten-loops-tinker) and re-exports the public API under the tinker namespace, so existing import tinker scripts run unchanged. uv add writes the dependency into a project’s pyproject.toml, so create a project first if you don’t already have one:
uv init loops-quickstart
cd loops-quickstart
uv add 'baseten-loops[tinker]'
These commands create the loops-quickstart project, install baseten-loops, and make the tinker namespace available. Verify the install:
import tinker
print(tinker.ServiceClient)
The output is:
<class 'tinker._service_client.ServiceClient'>

Provision a trainer

A Loops session pairs a trainer server (forward, backward, and optimizer steps) with a sampling server (generates from current weights). Constructing a ServiceClient and calling create_lora_training_client provisions both in one shot and returns a TrainingClient you can drive directly. The call blocks until the trainer is ready, which can take several minutes for a fresh base model. Start train_loops.py with the provision step:
import tinker

BASE_MODEL = "Qwen/Qwen3.5-2B"

service_client = tinker.ServiceClient()
training_client = service_client.create_lora_training_client(
    base_model=BASE_MODEL,
    rank=16,
)

print(f"session_id={service_client.session_id}")
print(f"run_id={training_client.run_id}")
You’ll append the training and listing steps to this same file in the next two sections, then run the whole thing once at the end.

Run a training round trip

The smallest complete round trip is one forward pass, one backward pass, one optimizer step, and one weight save. The block below mirrors the canonical SFT example: it tokenizes a prompt-and-answer pair, masks the prompt positions from the loss, runs the round trip, and saves a named checkpoint. Append to train_loops.py:
def build_sft_datum(tokenizer, prompt, answer):
    p = tokenizer.encode(prompt, add_special_tokens=False)
    a = tokenizer.encode(answer, add_special_tokens=False)
    tokens = p + a
    targets = [-100] * len(p) + list(a)  # mask prompt, keep answer
    return tokens, targets

tokens, targets = build_sft_datum(
    training_client.get_tokenizer(),
    prompt="What is the capital of France?\nAnswer:",
    answer=" Paris",
)
datum = tinker.Datum(
    model_input=tinker.ModelInput.from_ints(tokens),
    loss_fn_inputs={
        "target_tokens": tinker.TensorData(
            data=targets, dtype="int64", shape=[len(targets)]
        )
    },
)

fb = training_client.forward_backward(data=[datum]).result(timeout=600.0)
print(f"loss={fb.loss:.6f}")

optim = training_client.optim_step(
    tinker.AdamParams(learning_rate=4e-5)
).result(timeout=600.0)
print(f"optim_metrics={optim.metrics}")

save_resp = training_client.save_state(name="step-1").result(timeout=600.0)
print(f"saved checkpoint at {save_resp.path}")
forward_backward is the first training operation you submit after provisioning. Because create_lora_training_client waits for trainer readiness, this call starts after the trainer can accept work. save_state publishes the trainer-side weights under the name you pass and returns a bt://loops:<run_id>/weights/<name> URI. The paired sampler picks up the new weights asynchronously through the weight-sync runtime.
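The prompt masking in build_sft_datum can be checked offline, without a trainer. The sketch below repeats the same masking logic against a toy tokenizer; `ToyTokenizer` is a hypothetical stand-in that assigns one integer per whitespace-separated word, so its IDs have nothing to do with the real Qwen vocabulary.

```python
IGNORE = -100  # mask value used for prompt positions in build_sft_datum


class ToyTokenizer:
    """Hypothetical stand-in: one integer ID per whitespace-separated word."""

    def __init__(self):
        self.vocab = {}

    def encode(self, text, add_special_tokens=False):
        # Assign the next free ID the first time a word is seen.
        return [self.vocab.setdefault(word, len(self.vocab)) for word in text.split()]


def build_sft_datum(tokenizer, prompt, answer):
    p = tokenizer.encode(prompt, add_special_tokens=False)
    a = tokenizer.encode(answer, add_special_tokens=False)
    tokens = p + a
    targets = [IGNORE] * len(p) + list(a)  # mask prompt, keep answer
    return tokens, targets


tokens, targets = build_sft_datum(
    ToyTokenizer(),
    prompt="What is the capital of France?\nAnswer:",
    answer=" Paris",
)

# Print each position side by side to see which ones carry a target.
for position, (token, target) in enumerate(zip(tokens, targets)):
    status = "masked" if target == IGNORE else "trained"
    print(position, token, target, status)
```

The two lists always have the same length, and only the answer positions keep a real token ID as their target.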

List checkpoints

Every save_state call creates a checkpoint. The TrainingClient is already bound to your trainer, so listing is one line. Append to train_loops.py:
for ckpt in training_client.list_checkpoints():
    print(ckpt.id, ckpt.checkpoint_id, ckpt.created_at)
Now run the full script:
uv run python train_loops.py
Output values vary. A successful run prints a session ID, run ID, loss, optimizer metrics, saved checkpoint URI, and one listed checkpoint:
session_id=7qrp4v3
run_id=yqvvjjq
loss=8.783478
optim_metrics={'step': 1.0, 'lr': 4e-05, ...}
saved checkpoint at bt://loops:yqvvjjq/weights/step-1
RqglDBV step-1 2026-05-26 20:23:19.148000+00:00
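The saved checkpoint line follows the bt://loops:<run_id>/weights/<name> pattern described earlier, so a script that captures this output can recover the run ID and checkpoint name from the URI. This is a minimal sketch with the standard library; `parse_checkpoint_uri` is a hypothetical helper, and the pattern assumes run IDs never contain a slash.

```python
import re

# Assumption: the run ID contains no "/" and the name is everything after "/weights/".
_CHECKPOINT_URI = re.compile(r"^bt://loops:(?P<run_id>[^/]+)/weights/(?P<name>.+)$")


def parse_checkpoint_uri(uri):
    """Split a bt://loops:<run_id>/weights/<name> URI into (run_id, name)."""
    match = _CHECKPOINT_URI.match(uri)
    if match is None:
        raise ValueError(f"not a Loops checkpoint URI: {uri!r}")
    return match.group("run_id"), match.group("name")


run_id, name = parse_checkpoint_uri("bt://loops:yqvvjjq/weights/step-1")
print(run_id, name)
```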
The checkpoint listing is also available from the HTTP API, for scripts and CI pipelines that don’t run Python. Pass the run_id your script printed after provisioning:
curl --request GET \
  --url "https://api.baseten.co/v1/loops/checkpoints?run_id=<run_id>" \
  --header "Authorization: Bearer $BASETEN_API_KEY"
The response includes the same globally unique id and checkpoint name:
{
  "checkpoints": [
    {
      "id": "RqglDBV",
      "checkpoint_id": "step-1",
      "run_id": "yqvvjjq",
      "target": "trainer",
      "created_at": "2026-05-26T20:23:19.148Z"
    }
  ]
}
Each checkpoint also carries metadata for the base model, size, adapter config, and sync status. To fetch the actual weight files, pass the globally unique id value (RqglDBV in the example, not the step-1 name stored in the checkpoint_id field) as the checkpoint_id argument: training_client.get_checkpoint_archive_url(checkpoint_id). From a separate Python session where training_client isn’t in scope, construct tinker.ServiceClient() and call service_client.get_checkpoint_archive_url(checkpoint_id) instead.
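For pipelines that run Python but not the SDK, the HTTP listing shown with curl can be sketched with the standard library alone. The URL, query parameter, and header come from the curl example; the helper names (`build_list_request`, `checkpoint_names`, `list_checkpoints`) are hypothetical, and the parsing assumes the response shape shown in the sample JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://api.baseten.co/v1/loops/checkpoints"


def build_list_request(run_id, api_key):
    """Build the same GET request as the curl example, without sending it."""
    url = f"{API_URL}?{urlencode({'run_id': run_id})}"
    return Request(url, headers={"Authorization": f"Bearer {api_key}"}, method="GET")


def checkpoint_names(payload):
    """Return (id, checkpoint_id) pairs from a decoded listing response."""
    return [(c["id"], c["checkpoint_id"]) for c in payload.get("checkpoints", [])]


def list_checkpoints(run_id, api_key):
    """Send the request and parse the response; performs a network call."""
    with urlopen(build_list_request(run_id, api_key), timeout=30) as resp:
        return checkpoint_names(json.load(resp))


# Offline check against the sample response from this page.
sample = json.loads(
    '{"checkpoints": [{"id": "RqglDBV", "checkpoint_id": "step-1", "run_id": "yqvvjjq"}]}'
)
print(checkpoint_names(sample))
```

Calling `list_checkpoints(run_id, os.environ["BASETEN_API_KEY"])` (after `import os`) would issue the real request; the first element of each pair is the globally unique id to use when fetching weight files.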

Skip the cold start on re-runs

Your first run provisioned a trainer and sampler. The second run doesn’t have to. Grab the session_id your script printed (session_id=7qrp4v3 in the example output above), point the next run at it, and Loops reuses the same trainer and sampler:
export LOOPS_REUSE_FROM_SESSION_ID=7qrp4v3
uv run python train_loops.py
You can also pass the ID directly in code. If both the keyword argument and the environment variable are set, the keyword argument takes precedence:
service_client = tinker.ServiceClient(reuse_from_session_id="7qrp4v3")
From the HTTP API, send reuse_from_session_id in the body of POST /v1/loops/runs or POST /v1/loops/samplers. Reuse is best-effort. If the prior trainer is stopped, failed, or unhealthy, Loops provisions a fresh one and your script still runs.
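The precedence between the keyword argument and the environment variable can be written out as a small function. This sketch only illustrates the rule described above; `resolve_reuse_session_id` is a hypothetical helper and not how the SDK is implemented internally.

```python
import os


def resolve_reuse_session_id(kwarg=None, env=os.environ):
    """Illustrate the documented precedence: kwarg first, then env var, else None."""
    if kwarg is not None:
        return kwarg
    # An unset or empty variable means "provision fresh".
    return env.get("LOOPS_REUSE_FROM_SESSION_ID") or None


# Both set: the keyword argument is used.
print(resolve_reuse_session_id("7qrp4v3", {"LOOPS_REUSE_FROM_SESSION_ID": "older"}))
# Only the environment variable set: it is used.
print(resolve_reuse_session_id(None, {"LOOPS_REUSE_FROM_SESSION_ID": "7qrp4v3"}))
# Neither set: nothing to reuse.
print(resolve_reuse_session_id(None, {}))
```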

Next steps

To turn any of these checkpoints into an inference endpoint, run truss loops checkpoints deploy --run-id <run_id> and pick a checkpoint interactively, or pass --checkpoint-ids to deploy specific ones. See the checkpoints deploy CLI reference for the full option set.

Read Loops concepts to understand the paired-process model before you build longer training workflows: how sessions own trainer and sampling servers, how weight sync works, and how checkpoints land as unzipped folders of paginated presigned URLs rather than single archives.

If you’re migrating from Tinker, the Tinker compatibility page documents what carries over exactly (forward, backward, optim step, sampling, data types) and what behaves differently (checkpoint layout, authentication, cluster routing). The import tinker path used here already covers most cookbook recipes; that page names the three places where behavior has changed.

When you’re ready to call the HTTP API directly (for scripting deployments, fetching checkpoint files programmatically, or integrating Loops into a CI pipeline), the Loops API overview covers each route’s path, request body, response shape, and authentication scope in one place.

To generate from a checkpoint instead of only publishing it, swap save_state for save_weights_and_get_sampling_client, which publishes weights and returns a SamplingClient pinned to the new version.