A Loops session pairs a trainer server with a sampling server so that trained weights move to the sampler as soon as they exist. The trainer runs forward, backward, and optimizer steps; the sampler generates from current weights. Both live inside the same session and share a weight-sync path from the moment you provision them. Unlike offline training, where you finish a run, save a checkpoint, and then reload weights into a separate inference process, Loops keeps the sampler in sync throughout. When the trainer saves weights, the sampling server picks them up without restarting. The sampler you query at step 100 is running the same weights the trainer just committed.
Sessions
A Loops session is the container resource that scopes a training project’s work. It holds the trainer server and sampling server for a given base model and links them to a Baseten training project. Everything you create within a session (trainer servers, sampling servers, checkpoints) is queryable through that session’s ID. For the full route reference, see the Loops API overview.

Trainer servers
A trainer server is the process that runs the training computation: forward pass, backward pass, and optimizer step. It owns the model weights for the duration of the session and writes checkpoints to a dedicated storage path under a bt://loops:… URI. There is one trainer per session per base model. You set max_seq_len when creating the trainer; Baseten picks the GPU type, GPU count, and node topology (single-node or multi-node) that supports your sequence length and base model. When you call POST /v1/loops/runs, Baseten provisions the trainer server alongside its paired sampling server and returns both resource IDs. The HTTP API calls trainer servers “runs”; the SDK calls them “trainer servers”. Both refer to the same resource and ID.
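A hedged sketch of provisioning over HTTP with Python's requests. The route POST /v1/loops/runs and the max_seq_len parameter come from this page; the API host, auth header format, and the other request and response field names are assumptions to adapt to the real schema:

```python
import requests

BASE_URL = "https://api.baseten.co"  # assumed API host


def create_run(api_key: str, base_model: str, max_seq_len: int) -> dict:
    """Provision a trainer server and its paired sampling server.

    POST /v1/loops/runs is the documented route; the field names other
    than max_seq_len are illustrative guesses, not the documented schema.
    """
    resp = requests.post(
        f"{BASE_URL}/v1/loops/runs",
        headers={"Authorization": f"Api-Key {api_key}"},
        json={"base_model": base_model, "max_seq_len": max_seq_len},
        timeout=30,
    )
    resp.raise_for_status()
    # The response carries IDs for both the trainer server ("run")
    # and the sampling server created alongside it.
    return resp.json()
```

Keeping provisioning behind one function makes it easy to thread the returned IDs through the rest of a training script.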
Sampling servers
A sampling server runs inference from the trainer’s current weights. It’s provisioned alongside the trainer and linked to it at creation time. The sampler receives new weights through the weight-sync runtime whenever the trainer saves them. See How weight sync works for the mechanism. Because the sampler doesn’t restart during a session, generation latency stays low even as weights change, and you can interleave training steps and rollout calls without coordinating reloads.

Checkpoints
Every time the trainer saves weights, Loops creates a checkpoint identified by a bt://loops:<run_id>/(weights|sampler_weights)/<checkpoint_name> URI. The URI encodes the run ID, the checkpoint target (trainer weights or sampler weights), and the checkpoint name, for example, bt://loops:k4q95w5/weights/step-100. You pass this URI to create a trainer or sampler server from a prior checkpoint, or to deploy weights to inference.
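Client code that needs the individual pieces of such a URI can split it with a few lines of string handling. This helper is illustrative, not part of any Baseten SDK:

```python
def parse_checkpoint_uri(uri: str) -> dict:
    """Split a Loops checkpoint URI into its parts.

    Expected format (from the docs):
    bt://loops:<run_id>/(weights|sampler_weights)/<checkpoint_name>
    """
    prefix = "bt://loops:"
    if not uri.startswith(prefix):
        raise ValueError(f"not a Loops checkpoint URI: {uri}")
    run_id, target, name = uri[len(prefix):].split("/", 2)
    if target not in ("weights", "sampler_weights"):
        raise ValueError(f"unexpected checkpoint target: {target}")
    return {"run_id": run_id, "target": target, "checkpoint_name": name}


parts = parse_checkpoint_uri("bt://loops:k4q95w5/weights/step-100")
# parts == {"run_id": "k4q95w5", "target": "weights", "checkpoint_name": "step-100"}
```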
Checkpoints are stored as folders on disk, not as single archives. Listing checkpoint files returns a paginated response of presigned URLs, one URL per file in the folder, controlled by page_size and page_token query parameters. This differs from Tinker’s single-archive download shape: Tinker returns one URL you download and unpack; Loops returns a page of per-file URLs you fetch individually. If your client code unpacks a Tinker archive today, you’ll need to adapt it to iterate the paginated file list instead. The route is GET /v1/loops/checkpoints/{checkpoint_id}/files, documented in the Loops API overview.
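A sketch of walking that paginated list in Python. The route and the page_size/page_token parameters are documented above, but the response field names ("files", "next_page_token") are assumptions about the JSON shape, so the paging logic is factored behind a fetch_page callable you would implement against the real schema:

```python
from typing import Callable, Iterator, Optional


def iter_checkpoint_files(
    fetch_page: Callable[[Optional[str]], dict],
) -> Iterator[str]:
    """Yield every presigned file URL for a checkpoint, page by page.

    fetch_page(token) should GET /v1/loops/checkpoints/{checkpoint_id}/files
    with the given page_token and return the parsed JSON body. The keys
    "files" and "next_page_token" are assumed, not documented here.
    """
    token = None
    while True:
        page = fetch_page(token)
        yield from page["files"]  # one presigned URL per file in the folder
        token = page.get("next_page_token")
        if not token:
            return


# Example with stubbed pages standing in for real HTTP responses:
_pages = {
    None: {"files": ["https://storage.example/config.json"], "next_page_token": "p2"},
    "p2": {"files": ["https://storage.example/model.safetensors"]},
}
urls = list(iter_checkpoint_files(lambda token: _pages[token]))
```

Each yielded URL is then fetched individually, which is the adaptation a Tinker-style client needs: replace the download-and-unpack step with this iterate-and-fetch loop.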
Deployments
To ship a Loops checkpoint to a Baseten inference deployment, run truss train deploy_checkpoints --run-id <id>. The --run-id flag takes the Loops run ID (the same value the SDK exposes as trainer_server_id). Pass the checkpoint name alongside it to target a specific save. For all flags and options, see the Truss Train CLI reference.
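As a concrete sketch, reusing the example run ID from the checkpoint URI above (the exact flag for naming a specific checkpoint is not shown on this page, so only the documented --run-id flag appears here):

```shell
# Deploy saved weights from a Loops run to a Baseten inference deployment.
# k4q95w5 is the example run ID (trainer_server_id in the SDK); substitute
# your own, and see the Truss Train CLI reference for checkpoint selection.
truss train deploy_checkpoints --run-id k4q95w5
```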