The Loops Python SDK exposes three classes you’ll touch in any training script.
ServiceClient provisions trainer and sampling servers on the Baseten control plane and manages the session that ties them together. TrainingClient runs forward passes, backward passes, and optimizer steps against a live trainer server. SamplingClient generates completions from the current weights; its version pinning means you always sample from exactly the checkpoint you just trained. These three classes mirror Tinker’s shapes, and the methods you call have the same names.
Installation
Add the Loops SDK (and tinker-loops, if your existing training code imports from tinker) to your project:
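```bash
# Both package names are taken from this page; tinker-loops is only
# needed if your code imports from tinker.
pip install baseten-loops tinker-loops
```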
ServiceClient: provision
ServiceClient is the entry point for every session. It calls the Baseten control plane to create a TrainerSession, then provisions trainer and sampling servers within that session on demand.
ServiceClient(training_project_id, *, api_key, base_url)
Construct a ServiceClient and create a new TrainerSession on the Baseten control plane. Pass training_project_id to associate the session with an existing project; pass api_key to authenticate (defaults to the BASETEN_API_KEY environment variable).
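A minimal construction sketch; the import path and project ID below are illustrative, not confirmed API:

```python
from baseten_loops import ServiceClient  # import path is an assumption

# Create a session tied to an existing training project. api_key may be
# omitted when BASETEN_API_KEY is set in the environment.
service_client = ServiceClient(
    training_project_id="my-training-project",  # placeholder ID
)
print(service_client.session_id)  # assigned by the control plane
```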
ServiceClient.local(*, trainer_url, sampler_url) -> ServiceClient
Bind to already-running local trainer and sampler processes without contacting the control plane. Pass trainer_url and sampler_url as the base URLs of local server processes. Useful for end-to-end testing.
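For example, pointed at two locally running processes (URLs are placeholders):

```python
# Bind to local servers; no control-plane call is made.
service_client = ServiceClient.local(
    trainer_url="http://localhost:8000",
    sampler_url="http://localhost:8001",
)
```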
service_client.create_lora_training_client(base_model, rank, max_seq_len, seed, timeout, ready_timeout) -> TrainingClient
Provision a TrainerServer for the given HuggingFace base_model and return a connected TrainingClient. The control plane also provisions a paired sampling server in the same call; save_weights_and_get_sampling_client uses that paired URL to gate on version readiness.
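A provisioning sketch with an illustrative model ID and hyperparameters; timeout and ready_timeout are left at their defaults:

```python
# Provision a trainer (plus its paired sampler) and connect to it.
training_client = service_client.create_lora_training_client(
    base_model="Qwen/Qwen2.5-7B-Instruct",  # any HuggingFace model ID
    rank=16,
    max_seq_len=4096,
    seed=0,
)
```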
service_client.create_sampling_client(base_model, timeout, ready_timeout) -> SamplingClient
Provision a standalone SamplingServer for base_model and return a connected SamplingClient. Use this when you want to sample from a model independently of a training run. No LoRA adapter is loaded until you push weights.
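For example (model ID illustrative):

```python
# Standalone sampler, independent of any training run; no LoRA adapter
# is loaded until weights are pushed.
sampling_client = service_client.create_sampling_client(
    base_model="Qwen/Qwen2.5-7B-Instruct",
)
```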
service_client.list_checkpoints(trainer_server_id) -> list[Checkpoint]
List checkpoints saved by the trainer server identified by trainer_server_id. Calls the Baseten API, not the trainer server directly.
service_client.get_checkpoint_archive_url(trainer_server_id, checkpoint_id, page_size, page_token) -> CheckpointFilesResponse
Return presigned URLs for every file in the specified checkpoint folder. The Loops stack writes checkpoints as unzipped directories rather than archives, so this returns a file list instead of a single archive URL.
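A paging sketch, assuming a trainer_server_id from an earlier session; the Checkpoint and response field names below are assumptions, not confirmed API:

```python
# List checkpoints, then page through the presigned file URLs of the
# newest one.
checkpoints = service_client.list_checkpoints(trainer_server_id)
page_token = None
while True:
    resp = service_client.get_checkpoint_archive_url(
        trainer_server_id,
        checkpoints[-1].checkpoint_id,  # field name assumed
        page_size=100,
        page_token=page_token,
    )
    for f in resp.files:                # field name assumed
        print(f.url)                    # field name assumed
    page_token = resp.next_page_token   # field name assumed
    if not page_token:
        break
```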
service_client.session_id -> str
The session ID assigned by the control plane. Available after construction.
TrainingClient: train
TrainingClient talks directly to a dp_worker instance. Long-running operations use a submit-and-retrieve protocol: the submit fires immediately on the calling thread (so validation errors surface at call time) and .result() long-polls the server until the operation finishes. You can submit multiple operations before awaiting any of them.
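A sketch of the pipelined pattern this enables, assuming batch and adam_params are already built and using an illustrative loss-function name:

```python
# Submit two operations back-to-back; argument validation happens at
# submit time, while the heavy work runs on the server.
fb = training_client.forward_backward(batch, loss_fn="cross_entropy")
opt = training_client.optim_step(adam_params)

# Block for results only when they are needed; .result() long-polls.
fb_result = fb.result()
opt_result = opt.result()
```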
TrainingClient.forward_backward(data, loss_fn, loss_fn_config) -> OperationFuture[ForwardBackwardOutput]
Run a forward and backward pass over data (a list of Datum objects) using the specified loss function. Returns an OperationFuture; call .result() to block until the pass completes and retrieve the loss.
TrainingClient.forward(data, loss_fn, loss_fn_config) -> OperationFuture[ForwardBackwardOutput]
Run a forward pass without gradient computation. Same inputs and output shape as forward_backward, but the gradient buffer is left untouched, so it is safe to interleave with gradient accumulation steps.
TrainingClient.optim_step(adam_params) -> OperationFuture[OptimStepResponse]
Apply the accumulated gradients using the Adam optimizer configured by adam_params. Call this after one or more forward_backward calls.
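Together, the three calls form a gradient-accumulation step. A sketch, assuming AdamParams is importable alongside the clients and using illustrative hyperparameters and loss-function name:

```python
from baseten_loops import AdamParams  # import path is an assumption

# Accumulate gradients over several microbatches, then apply one
# optimizer step.
futures = [
    training_client.forward_backward(microbatch, loss_fn="cross_entropy")
    for microbatch in microbatches
]
losses = [f.result() for f in futures]

training_client.optim_step(AdamParams(learning_rate=1e-4)).result()
```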
TrainingClient.save_state(name, ttl_seconds) -> OperationFuture[SaveWeightsResponse]
Persist a local training checkpoint under name. When a weight sync URI is configured server-side, also publishes the LoRA adapter so a polling sampler can hot-swap to the new weights.
TrainingClient.save_weights_and_get_sampling_client(name) -> OperationFuture[SamplingClient]
Publish the LoRA adapter to the paired sampling server under name and return a future that resolves to a SamplingClient pinned to the just-published version. Calling .result() runs two stages: the trainer publishes weights, then the SDK polls the sampler until at least one replica reports the new version loaded. The returned SamplingClient carries X-Min-Policy-Version on every subsequent sample() call, so requests only land on replicas that have the right weights.
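For example (checkpoint name illustrative):

```python
# Publish the adapter and wait for the paired sampler to load it; the
# returned client is pinned to this exact version.
sampling_client = training_client.save_weights_and_get_sampling_client(
    name="step-0100",
).result()
```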
TrainingClient.list_checkpoints() -> list[Checkpoint]
List checkpoints for the trainer server bound to this client. Requires that this client was constructed via ServiceClient.create_lora_training_client (which populates the necessary session and server IDs automatically).
TrainingClient.get_checkpoint_archive_url(checkpoint_id, page_size, page_token) -> CheckpointFilesResponse
Return presigned URLs for every file in a checkpoint folder. Same semantics as ServiceClient.get_checkpoint_archive_url but scoped to the trainer server this client is bound to.
TrainingClient.get_tokenizer()
Return the HuggingFace PreTrainedTokenizer for the base model. Cached after the first load.
TrainingClient.get_info() -> GetInfoResponse
Return the model configuration for this training session (base model name, LoRA rank, and max sequence length) without a server round-trip.
SamplingClient: sample
SamplingClient generates text completions from the model the sampler currently has loaded. There are two creation paths with different version semantics: ServiceClient.create_sampling_client returns an auto-updating client that follows whatever weights the sampler currently holds, while TrainingClient.save_weights_and_get_sampling_client returns a snapshot client pinned to the version you just trained. Both clients expose the same sample method.
SamplingClient.sample(prompt, num_samples, sampling_params, include_prompt_logprobs, topk_prompt_logprobs) -> SampleResult
Generate num_samples completions from prompt (a ModelInput). Pass a SamplingParams instance to control temperature, top-p, top-k, max tokens, and stop sequences. Set include_prompt_logprobs=True to get per-token log-probabilities for the input tokens alongside the output.
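A sampling sketch; the import path and the SampleResult/SampledSequence field names here are assumptions:

```python
from baseten_loops import ModelInput, SamplingParams  # path assumed

# Tokenize a prompt and generate four completions.
tokenizer = sampling_client.get_tokenizer()
prompt = ModelInput.from_ints(tokenizer.encode("The capital of France is"))

result = sampling_client.sample(
    prompt,
    num_samples=4,
    sampling_params=SamplingParams(temperature=0.7, max_tokens=64),
)
for seq in result.sequences:             # field name assumed
    print(tokenizer.decode(seq.tokens))  # field name assumed
```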
SamplingClient.compute_logprobs(prompt) -> list[float | None]
Return the per-token log-probabilities for prompt without generating any new tokens. Index 0 is always None because the first token has no preceding context to score against.
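For example, scoring the prompt from the previous sketch:

```python
# Score an existing prompt without generating new tokens.
logprobs = sampling_client.compute_logprobs(prompt)
assert logprobs[0] is None  # first token has no context to score
total_logprob = sum(lp for lp in logprobs[1:] if lp is not None)
```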
SamplingClient.discover_base_model_name() -> str
Return the base model ID from the sampler’s /v1/models list, specifically the entry with no parent. Retries with backoff while the sampler is still deploying.
SamplingClient.discover_adapter_name() -> str | None
Return the currently registered LoRA adapter ID, or None if no adapter is loaded. sample() calls this internally when model_name is not set, and caches the result until the adapter changes.
SamplingClient.get_tokenizer()
Return the HuggingFace PreTrainedTokenizer for the base model this client was created with. Raises ValueError if base_model was not set at construction time.
Types
Datum: A single training example: a ModelInput paired with a dict of TensorData loss function inputs (see the construction sketch after this list).
ModelInput: A tokenized prompt, represented as a list of ModelInputChunk objects. Construct with ModelInput.from_ints(token_ids) for the common case.
ModelInputChunk: A discriminated union of EncodedTextChunk (a list of token IDs) and ImageChunk (a base64-encoded image with an expected token count).
TensorData: A serializable tensor with a flat data list, a dtype string, and a shape. Convert to and from torch.Tensor with TensorData.to_torch() and TensorData.from_torch(tensor).
SamplingParams: Controls for text generation: temperature, top_p, top_k, max_tokens, seed, and stop.
AdamParams: Optimizer hyperparameters: learning_rate, beta1, beta2, eps, weight_decay, and grad_clip_norm.
SampledSequence: A single generated sequence: a list of output token IDs, optional per-token log-probabilities, and a stop reason.
SampleResult: The full response from sample(): a list of SampledSequence objects, an optional PromptLogprobs, and the policy_version the sampler replica was running.
Checkpoint: Metadata for a saved checkpoint, populated by list_checkpoints().
CheckpointFilesResponse: A paginated list of presigned file URLs for a checkpoint, populated by get_checkpoint_archive_url().
OperationFuture[T]: A handle to a long-running training operation. Call .result() or .result(timeout=seconds) to block until the operation completes and return the result.
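A sketch of assembling a Datum from these types, assuming the keyword field names model_input and loss_fn_inputs and an illustrative loss-input key:

```python
import torch

from baseten_loops import Datum, ModelInput, TensorData  # path assumed

token_ids = [1, 2, 3, 4]  # illustrative token IDs

datum = Datum(
    model_input=ModelInput.from_ints(token_ids),
    loss_fn_inputs={
        # Per-token loss weights as TensorData; the key name is illustrative.
        "weights": TensorData.from_torch(torch.ones(len(token_ids))),
    },
)
```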
Errors
RemoteOpError: The server reports that an async operation failed. The error_class attribute carries the server-side exception class name (for example, "ValueError" or "DispatcherError"), which is useful for routing in caller code; see the handling sketch after this list.
UnknownRequestError: The server returned 404 for an operation ID. This can mean the server has no record of the operation (after a pod restart, for example) or that the result was TTL-evicted. Resubmit the operation if the work is still needed; the server’s idempotency-key deduplication prevents double-execution.
ServerShutdownError: The server is shutting down (503 response). Retry the request against a different replica.
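A hedged handling pattern for these three failure modes; both recovery helpers are hypothetical:

```python
try:
    result = future.result()
except UnknownRequestError:
    # Operation record lost (pod restart or TTL eviction); resubmitting
    # is safe because idempotency keys prevent double-execution.
    result = resubmit(training_client)         # hypothetical helper
except ServerShutdownError:
    # 503 during shutdown: retry against a different replica.
    result = retry_elsewhere(training_client)  # hypothetical helper
except RemoteOpError as err:
    if err.error_class == "ValueError":
        raise  # bad input; retrying will not help
    result = resubmit(training_client)
```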
tinker-loops compatibility package
Install tinker-loops alongside baseten-loops and import tinker. The tinker-loops package maps Tinker’s client interface onto the Loops SDK, so existing training scripts that import from tinker run without modification. The underlying classes (ServiceClient, TrainingClient, SamplingClient) and every method on them are the same; only the import path changes. For the full list of mapped names and any behavioral differences, see the Tinker compatibility guide.
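A sketch of the drop-in path, assuming the same constructor arguments as the Loops ServiceClient (the project and model IDs are placeholders):

```python
# Only the import path differs from the examples above.
import tinker

service_client = tinker.ServiceClient(
    training_project_id="my-training-project",
)
training_client = service_client.create_lora_training_client(
    base_model="Qwen/Qwen2.5-7B-Instruct",
    rank=16,
    max_seq_len=4096,
)
```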