Use the Loops Python SDK to create a LoRA training run, save a checkpoint, and list that checkpoint from both Python and the HTTP API. The base model throughout is Qwen/Qwen3.5-2B, one of the supported base models.

Prerequisites

  • Python 3.12+ and uv: The quickstart uses uv to install the Loops client and run the training script.
  • API key: A workspace API key with org access to Loops, exported as BASETEN_API_KEY.
Loops is in early access. To enable it for your workspace, fill out the signup form.
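A missing or blank API key is easier to diagnose before any trainer is provisioned. The following is a minimal sketch of a pre-flight check; require_api_key is a hypothetical helper, not part of the Loops SDK, and the only name taken from this page is the BASETEN_API_KEY environment variable.

```python
import os


def require_api_key(env=os.environ, name="BASETEN_API_KEY"):
    """Return the API key from env, or raise if it is missing or blank.

    Hypothetical helper for illustration; the SDK may do its own validation.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it before running the quickstart"
        )
    return key
```

Calling require_api_key() at the top of a script fails fast with a readable message instead of surfacing an authentication error partway through provisioning.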

Install

Install baseten-loops with the [tinker] extra into a uv project. Create one first if you don’t have one:
uv init loops-quickstart
cd loops-quickstart
uv add 'baseten-loops[tinker]'
The [tinker] extra pulls in baseten-loops-tinker, which re-exports the public API under the tinker namespace so existing import tinker scripts run unchanged. Verify the install by running the following snippet with uv run python:
import tinker
from importlib.metadata import version

print(tinker.ServiceClient)
print("baseten-loops-tinker", version("baseten-loops-tinker"))
The printed class path and resolved baseten-loops-tinker version confirm Baseten’s Tinker compatibility package is installed, not the upstream tinker package.

Provision a trainer

A Loops session pairs a trainer server (forward, backward, and optimizer steps) with a sampling server (generates from current weights). Constructing a ServiceClient and calling create_lora_training_client() provisions both and returns a TrainingClient. The call blocks until the trainer is ready, which can take several minutes for a fresh base model. Start train_loops.py with the provision step:
train_loops.py
import tinker

BASE_MODEL = "Qwen/Qwen3.5-2B"

service_client = tinker.ServiceClient()
training_client = service_client.create_lora_training_client(
    base_model=BASE_MODEL,
    rank=16,
)

print(f"session_id={service_client.session_id}")
print(f"run_id={training_client.run_id}")
You’ll append the training and listing steps to this same file in the next two sections, then run the whole thing once at the end.

Run a training round trip

The smallest complete round trip is one forward pass, one backward pass, one optimizer step, and one weight save. The block below mirrors the canonical supervised fine-tuning (SFT) example: it tokenizes a prompt-and-answer pair, masks the prompt positions from the loss, runs the round trip, and saves a named checkpoint. Append to train_loops.py:
train_loops.py
def build_sft_datum(tokenizer, prompt, answer):
    p = tokenizer.encode(prompt, add_special_tokens=False)
    a = tokenizer.encode(answer, add_special_tokens=False)
    tokens = p + a
    targets = [-100] * len(p) + list(a)  # mask prompt, keep answer
    return tokens, targets

tokens, targets = build_sft_datum(
    training_client.get_tokenizer(),
    prompt="What is the capital of France?\nAnswer:",
    answer=" Paris",
)
datum = tinker.Datum(
    model_input=tinker.ModelInput.from_ints(tokens),
    loss_fn_inputs={
        "target_tokens": tinker.TensorData(
            data=targets, dtype="int64", shape=[len(targets)]
        )
    },
)

fb = training_client.forward_backward(data=[datum]).result(timeout=600.0)
print(f"loss={fb.loss:.6f}")

optim = training_client.optim_step(
    tinker.AdamParams(learning_rate=4e-5)
).result(timeout=600.0)
print(f"optim_metrics={optim.metrics}")

save_resp = training_client.save_state(name="step-1").result(timeout=600.0)
print(f"saved checkpoint at {save_resp.path}")
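To see what the prompt masking produces without provisioning a trainer, the sketch below runs the same build_sft_datum logic against a stand-in tokenizer. ToyTokenizer is hypothetical and maps each character to its code point; a real tokenizer returns different IDs, but the alignment between tokens and targets is the same.

```python
class ToyTokenizer:
    """Hypothetical stand-in tokenizer: one token ID per character."""

    def encode(self, text, add_special_tokens=False):
        return [ord(ch) for ch in text]


def build_sft_datum(tokenizer, prompt, answer):
    p = tokenizer.encode(prompt, add_special_tokens=False)
    a = tokenizer.encode(answer, add_special_tokens=False)
    tokens = p + a
    # Prompt positions get -100 so they are masked from the loss;
    # answer positions keep their token IDs as targets.
    targets = [-100] * len(p) + list(a)
    return tokens, targets


tokens, targets = build_sft_datum(ToyTokenizer(), "Q:", " A")
print(tokens)   # [81, 58, 32, 65]
print(targets)  # [-100, -100, 32, 65]
```

The two lists always have the same length, which is why the quickstart can pass len(targets) as the tensor shape.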
forward_backward() is the first training operation you submit after provisioning. save_state() publishes the trainer-side weights under the name you pass and returns a bt://loops:<run_id>/weights/<name> URI. The paired sampler picks up the new weights asynchronously through the weight-sync runtime.
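If a later step needs the run ID or checkpoint name from the returned URI, it can be split with a small parser. This is a sketch that assumes the URI always follows the bt://loops:<run_id>/weights/<name> shape described above; parse_checkpoint_uri is a hypothetical helper, not an SDK function.

```python
import re

# Pattern for the URI shape shown in this quickstart (an assumption beyond that).
_CHECKPOINT_URI = re.compile(r"^bt://loops:(?P<run_id>[^/]+)/weights/(?P<name>.+)$")


def parse_checkpoint_uri(uri):
    """Split a checkpoint URI into (run_id, name), or raise ValueError."""
    match = _CHECKPOINT_URI.match(uri)
    if match is None:
        raise ValueError(f"not a Loops checkpoint URI: {uri!r}")
    return match.group("run_id"), match.group("name")


print(parse_checkpoint_uri("bt://loops:yqvvjjq/weights/step-1"))
# ('yqvvjjq', 'step-1')
```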

List checkpoints

Every save_state() call creates a checkpoint. The bound TrainingClient lists them with no arguments. Append to train_loops.py:
train_loops.py
for ckpt in training_client.list_checkpoints():
    print(ckpt.id, ckpt.checkpoint_id, ckpt.created_at)
Now run the full script. Output values vary, but a successful run prints a session ID, run ID, loss, optimizer metrics, saved checkpoint URI, and one listed checkpoint:
uv run python train_loops.py
session_id=7qrp4v3
run_id=yqvvjjq
loss=8.783478
optim_metrics={'step': 1.0, 'lr': 4e-05, ...}
saved checkpoint at bt://loops:yqvvjjq/weights/step-1
RqglDBV step-1 2026-05-26 20:23:19.148000+00:00
The same listing is available from the HTTP API for scripts and CI pipelines that don’t run Python. Use the run_id your script printed when provisioning. The response includes the same globally unique id and checkpoint name:
curl --request GET \
  --url "https://api.baseten.co/v1/loops/checkpoints?run_id=<run_id>" \
  --header "Authorization: Bearer $BASETEN_API_KEY"
Each checkpoint also carries metadata for the base model, size, adapter config, and sync status. To fetch the weight files, pass the globally unique id value (ckpt.id, RqglDBV in the example output above, not the step-1 name) as the checkpoint_id argument: training_client.get_checkpoint_archive_url(checkpoint_id). From a separate Python session where training_client isn’t in scope, construct tinker.ServiceClient() and call service_client.get_checkpoint_archive_url(checkpoint_id) instead.
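For scripts that call the HTTP route without the SDK, the curl request above can be reproduced with the Python standard library. This is a sketch: build_list_request and fetch_checkpoints are hypothetical helpers, and fetch_checkpoints assumes the response body is JSON, which this page does not state explicitly.

```python
import json
import urllib.parse
import urllib.request

CHECKPOINTS_URL = "https://api.baseten.co/v1/loops/checkpoints"


def build_list_request(run_id, api_key):
    """Build the same GET request as the curl example, without sending it."""
    query = urllib.parse.urlencode({"run_id": run_id})
    return urllib.request.Request(
        f"{CHECKPOINTS_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_checkpoints(run_id, api_key):
    """Send the request and decode the body (assumes a JSON response)."""
    request = build_list_request(run_id, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


req = build_list_request("yqvvjjq", "example-key")
print(req.full_url)
# https://api.baseten.co/v1/loops/checkpoints?run_id=yqvvjjq
```

Building the request separately from sending it keeps the URL and header construction easy to check in CI without network access.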

Skip the cold start on re-runs

Your first run provisioned a trainer and sampler. The second run doesn’t have to. Grab the session_id your script printed (session_id=7qrp4v3 in the example output above), point the next run at it, and Loops reuses the same trainer and sampler:
export LOOPS_REUSE_FROM_SESSION_ID=7qrp4v3
uv run python train_loops.py
You can also pass the ID directly in code, which wins if both the kwarg and the environment variable are set:
service_client = tinker.ServiceClient(reuse_from_session_id="7qrp4v3")
From the HTTP API, send reuse_from_session_id in the body of POST /v1/loops/runs or POST /v1/loops/samplers. Reuse is best-effort. If the prior trainer is stopped, failed, or unhealthy, Loops provisions a fresh one and your script still runs.
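The precedence described above (the keyword argument beats the environment variable) can be pictured with the sketch below. It illustrates the documented behavior only and is not the SDK's implementation; resolve_reuse_session_id is a hypothetical name.

```python
import os

ENV_VAR = "LOOPS_REUSE_FROM_SESSION_ID"


def resolve_reuse_session_id(kwarg=None, env=os.environ):
    """Pick the session ID to reuse: explicit kwarg first, then the env var.

    Returns None when neither is set, meaning a fresh session is provisioned.
    """
    if kwarg:
        return kwarg
    return env.get(ENV_VAR) or None


print(resolve_reuse_session_id("7qrp4v3", {ENV_VAR: "other"}))  # 7qrp4v3
print(resolve_reuse_session_id(None, {ENV_VAR: "other"}))       # other
print(resolve_reuse_session_id(None, {}))                       # None
```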

Next steps

To turn any of these checkpoints into an inference endpoint, run truss loops checkpoints deploy --run-id <run_id> and pick a checkpoint interactively, or pass --checkpoint-ids to deploy specific ones. See the checkpoints deploy CLI reference for the full option set.
Read Loops concepts to understand the paired-process model before you build longer training workflows: how sessions own trainer and sampling servers, how weight sync works, and how checkpoints land as unzipped folders of paginated presigned URLs rather than single archives.
If you’re migrating from Tinker, the Tinker compatibility page documents what carries over exactly (forward, backward, optim step, sampling, data types) and what behaves differently (checkpoint layout, authentication, cluster routing). The import tinker path used here already covers most cookbook recipes; that page names the three places where behavior has changed.
When you’re ready to call the HTTP API directly (for scripting deployments, fetching checkpoint files programmatically, or integrating Loops into a CI pipeline), the Loops API overview lists every route with its authentication scope, and each route has its own page with the request body, response shape, and an interactive playground.
To generate from a checkpoint instead of only publishing it, swap save_state() for save_weights_and_get_sampling_client(), which publishes weights and returns a SamplingClient pinned to the new version.