
By the end of this page you’ll have a checkpoint stored in Baseten that you can list, download, or hand to an inference deployment. The base model throughout is Qwen/Qwen3-8B. Before you start, export two environment variables: BASETEN_API_KEY (a workspace key with org access to Loops) and TRAINERS_PROJECT_ID (the ID of the training project you’re targeting).

Prerequisites

  • A Baseten workspace API key with org access to Loops, exported as BASETEN_API_KEY. See API keys.
  • A training project ID, exported as TRAINERS_PROJECT_ID.
  • Python 3.10+ and uv.
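For reference, the two exports look like this; replace the placeholder values with your own:
export BASETEN_API_KEY=<your-workspace-api-key>        # workspace key with org access to Loops
export TRAINERS_PROJECT_ID=<your-training-project-id>  # ID of the training project you're targeting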

1. Install

The main client package is baseten-loops, on PyPI. The Tinker compatibility package, tinker-loops, ships separately and re-exports the public API under the tinker namespace, so existing import tinker scripts run unchanged. Add both to your project:
uv add baseten-loops tinker-loops
Verify the install:
import tinker
print(tinker.__version__)
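The same check works as a one-liner if you’d rather not open a REPL:
uv run python -c "import tinker; print(tinker.__version__)"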

2. Provision a trainer

A Loops session pairs a trainer server (forward, backward, and optimizer steps) with a sampling server (generates from current weights). Constructing a ServiceClient and calling create_lora_training_client provisions both in one shot and returns clients you can drive directly. Cold start typically takes about five minutes. Save this as provision.py:
import os
import tinker

PROJECT_ID = os.environ["TRAINERS_PROJECT_ID"]
BASE_MODEL = "Qwen/Qwen3-8B"

service_client = tinker.ServiceClient(PROJECT_ID)

# Provisions the trainer server and the sampling server together and
# blocks until both are healthy (about five minutes on a cold start).
training_client = service_client.create_lora_training_client(
    base_model=BASE_MODEL,
    rank=16,
    max_seq_len=8192,
)

print(f"session_id={service_client.session_id}")
print(f"trainer_server_id={training_client.trainer_server_id}")
Run it:
uv run python provision.py
When the call returns, both servers are healthy and ready to receive training calls. Save the printed trainer_server_id; you’ll use it in step 4.
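If you’re scripting the whole flow, one way to capture the ID is to parse it out of the script’s output. The sed pattern below is a sketch that assumes the exact print format used in provision.py:
export TRAINER_ID=$(uv run python provision.py | sed -n 's/^trainer_server_id=//p')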

3. Run a training round trip

The smallest complete round trip is one forward pass, one backward pass, one optimizer step, and one weight save. The script below mirrors the canonical SFT example: it tokenizes a prompt-and-answer pair, masks the prompt positions from the loss, runs the round trip, and saves a named checkpoint. Save this as train.py:
import os
import tinker

PROJECT_ID = os.environ["TRAINERS_PROJECT_ID"]
BASE_MODEL = "Qwen/Qwen3-8B"

service_client = tinker.ServiceClient(PROJECT_ID)
training_client = service_client.create_lora_training_client(
    base_model=BASE_MODEL,
    rank=16,
    max_seq_len=8192,
)

def build_sft_datum(tokenizer, prompt, answer):
    p = tokenizer.encode(prompt, add_special_tokens=False)
    a = tokenizer.encode(answer, add_special_tokens=False)
    tokens = p + a
    targets = [-100] * len(p) + list(a)  # -100 masks prompt positions from the loss
    return tokens, targets

tokens, targets = build_sft_datum(
    training_client.get_tokenizer(),
    prompt="What is the capital of France?\nAnswer:",
    answer=" Paris",
)
datum = tinker.Datum(
    model_input=tinker.ModelInput.from_ints(tokens),
    loss_fn_inputs={
        "target_tokens": tinker.TensorData(
            data=targets, dtype="int64", shape=[len(targets)]
        )
    },
)

# One forward and one backward pass over the batch.
fb = training_client.forward_backward(data=[datum]).result(timeout=600.0)
print(f"loss={fb.loss:.6f}")

# One Adam optimizer step with the accumulated gradients.
optim = training_client.optim_step(
    tinker.AdamParams(learning_rate=4e-5)
).result(timeout=600.0)
print(f"optim_metrics={optim.metrics}")

# Commit the weights as a named checkpoint and load them onto the sampling server.
sampling_client = training_client.save_weights_and_get_sampling_client(
    name="step-1"
).result(timeout=600.0)

print(f"trainer_server_id={training_client.trainer_server_id}")
Run it:
uv run python train.py
When save_weights_and_get_sampling_client returns, the weights are committed as a named checkpoint and the sampling server is loaded with the new version.
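To sanity-check the new checkpoint, you can generate from the sampling client straight away. The snippet below (appended to train.py) is a sketch that assumes the Tinker-style sampling API, where sample takes a ModelInput and SamplingParams and returns a future over generated token sequences; adjust the names to whatever your installed version exposes:
tokenizer = training_client.get_tokenizer()
prompt_tokens = tokenizer.encode(
    "What is the capital of France?\nAnswer:", add_special_tokens=False
)

# Assumed Tinker-style call shape: sample() returns a future whose result
# carries the generated token IDs.
future = sampling_client.sample(
    prompt=tinker.ModelInput.from_ints(prompt_tokens),
    sampling_params=tinker.SamplingParams(max_tokens=8, temperature=0.0),
    num_samples=1,
)
result = future.result(timeout=600.0)
print(tokenizer.decode(result.sequences[0].tokens))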

4. List checkpoints

Every save_weights_and_get_sampling_client call creates a checkpoint. List them with the SDK to get checkpoint IDs and metadata, passing the trainer_server_id you saved in step 2 (exported here as TRAINER_ID):
import os
import tinker

service_client = tinker.ServiceClient(os.environ["TRAINERS_PROJECT_ID"])
checkpoints = service_client.list_checkpoints(
    trainer_server_id=os.environ["TRAINER_ID"]
)
for ckpt in checkpoints:
    print(ckpt.id, ckpt.name)
The same listing is available from the HTTP API for scripts and CI pipelines that don’t run Python:
curl --request GET \
  --url "https://api.baseten.co/v1/loops/checkpoints?run_id=${RUN_ID}" \
  --header "Authorization: Api-Key $BASETEN_API_KEY"
The HTTP API calls trainer servers “runs”; RUN_ID here is the same value the SDK exposes as trainer_server_id. The response includes a list of checkpoint objects, each with an id, a name (the string you passed to save_weights_and_get_sampling_client), and a creation timestamp. Pass a checkpoint id to get_checkpoint_archive_url to retrieve paginated presigned URLs for the weight files.
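As a sketch of the download flow, suppose get_checkpoint_archive_url returns an object carrying a page of presigned URLs; the field name urls below is an assumption about the response shape, not documented API. The presigned URLs themselves need no Authorization header:
import os
import urllib.request
import tinker

service_client = tinker.ServiceClient(os.environ["TRAINERS_PROJECT_ID"])

# "urls" is an assumed field name for the page of presigned URLs.
archive = service_client.get_checkpoint_archive_url(checkpoint_id="<checkpoint-id>")
for url in archive.urls:
    filename = url.split("?", 1)[0].rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, filename)  # presigned URLs need no auth header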

Next steps

  • The Loops concepts page explains the paired-process model in detail: how sessions own trainer and sampling servers, how weight sync works, and how checkpoints land as unzipped folders of paginated presigned URLs rather than single archives. Reading it will make the resource IDs in this quickstart feel less arbitrary.
  • If you’re migrating from Tinker, the Tinker compatibility page documents what carries over exactly (forward, backward, optim step, sampling, data types) and what behaves differently (checkpoint layout, authentication, cluster routing). The import tinker path used here already covers most cookbook recipes; that page names the three places where behavior has changed.
  • When you’re ready to call the HTTP API directly (for scripting deployments, fetching checkpoint files programmatically, or integrating Loops into a CI pipeline), the Loops API overview covers each route’s path, request body, response shape, and authentication scope in one place.