Datum lists, loop over it, save checkpoints you can resume from, and evaluate against the live sampler between steps.
The example dataset is pirate-ultrachat-10k, chat-format conversations that teach the model pirate dialect. It’s the same dataset the Truss Train tutorial uses, so you can compare the two paths on identical work.
Turn a dataset into training data
Each training example becomes a Datum: input tokens plus loss targets that mask the prompt and supervise the answer, with the same label shift the quickstart uses. For chat data, render the conversation with the tokenizer’s chat template and treat the final assistant message as the answer.
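The masking and label shift described above can be sketched in plain Python, independent of the SDK. This is a minimal illustration on toy token IDs, not the guide's own listing: `build_example` and `split_conversation` are hypothetical helper names, and the weight convention (0 for prompt positions, 1 for answer positions) is an assumption about how the loss targets are expressed.

```python
def build_example(prompt_tokens, answer_tokens):
    """Return (input_tokens, target_tokens, weights) for one example.

    Hypothetical helper: illustrates the shift and mask only.
    """
    if not prompt_tokens:
        raise ValueError("expected a non-empty prompt")
    tokens = list(prompt_tokens) + list(answer_tokens)
    # Label shift: the model at position i is trained to predict token i + 1.
    input_tokens = tokens[:-1]
    target_tokens = tokens[1:]
    # Mask the prompt (weight 0) and supervise the answer (weight 1).
    # After the shift, the first answer token is the target at index
    # len(prompt_tokens) - 1.
    weights = [0.0] * (len(prompt_tokens) - 1) + [1.0] * len(answer_tokens)
    return input_tokens, target_tokens, weights


def split_conversation(messages):
    """Split a chat into (context messages, final assistant answer)."""
    if not messages or messages[-1]["role"] != "assistant":
        raise ValueError("expected the conversation to end with an assistant turn")
    return messages[:-1], messages[-1]["content"]


# Toy token IDs; a real run would get these from the tokenizer's chat template.
inputs, targets, weights = build_example([11, 12, 13], [21, 22])
print(inputs)
print(targets)
print(weights)
```

All three lists have the same length, and only the positions whose target is an answer token carry a nonzero weight.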
Add datasets to your project (uv add datasets) and start train_dataset.py:
train_dataset.py
The train[:64] slice keeps this guide’s run short. Use the full split for a real fine-tune.
Run the training loop
Each iteration is the quickstart’s round trip over a batch: one forward_backward() on a list of Datum, one optim_step(). Append:
train_dataset.py
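As a rough illustration of the loop's shape (not the guide's own listing), the sketch below runs the same round trip against a stand-in client. `FakeTrainingClient` is hypothetical: it borrows only the method names forward_backward() and optim_step() from the text, runs synchronously, and fits a one-weight toy model. The real client's signatures and return types may differ.

```python
class FakeTrainingClient:
    """Hypothetical stand-in for the SDK's training client (not the real API)."""

    def __init__(self):
        self.weight = 0.0
        self._grad = 0.0

    def forward_backward(self, batch):
        # Toy model: predict y as weight * x, scored by mean squared error.
        n = len(batch)
        loss = sum((self.weight * x - y) ** 2 for x, y in batch) / n
        self._grad = sum(2 * (self.weight * x - y) * x for x, y in batch) / n
        return loss

    def optim_step(self, learning_rate=0.1):
        # Plain gradient descent; the real optimizer is assumed to be richer.
        self.weight -= learning_rate * self._grad


def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Toy (x, y) pairs standing in for a list of Datum; the target relation is y = 2x.
data = [(x, 2 * x) for x in (1.0, 2.0)] * 4

client = FakeTrainingClient()
losses = []
for epoch in range(2):
    for batch in batches(data, batch_size=4):
        loss = client.forward_backward(batch)  # one pass over the batch
        client.optim_step()                    # one optimizer update
        losses.append(loss)
        print(f"epoch {epoch} loss {loss:.4f}")
```

Each batch gets exactly one forward_backward() followed by one optim_step(), which is the pairing the paragraph above describes.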
Save a resumable checkpoint
The quickstart’s save_weights_for_sampler() publishes weights for sampling and deployment but omits optimizer state. For a checkpoint you can resume training from, use save_state(); to publish the same point for sampling, save both. Append:
train_dataset.py
To resume, call load_state_with_optimizer() with the saved state.path.
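Why optimizer state matters for resuming can be shown with a toy example. The sketch below is a pure-Python Adam on a single scalar weight; it assumes an Adam-style optimizer for illustration and does not reflect how the service stores or restores state. It compares an uninterrupted run with a run resumed from a full checkpoint and a run restarted from weights alone.

```python
import copy


def adam_step(state, grad, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Apply one Adam update to the scalar weight in state (mutates state)."""
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * grad
    state["v"] = b2 * state["v"] + (1 - b2) * grad * grad
    m_hat = state["m"] / (1 - b1 ** state["t"])  # bias-corrected first moment
    v_hat = state["v"] / (1 - b2 ** state["t"])  # bias-corrected second moment
    state["w"] -= lr * m_hat / (v_hat ** 0.5 + eps)


def train(state, steps):
    """Run Adam for a number of steps on the toy loss (w - 3)^2."""
    for _ in range(steps):
        adam_step(state, grad=2 * (state["w"] - 3.0))
    return state


fresh = {"w": 0.0, "m": 0.0, "v": 0.0, "t": 0}

# Reference: six steps with no interruption.
uninterrupted = train(copy.deepcopy(fresh), steps=6)

# Full checkpoint after three steps: weight plus moments and step count.
full_ckpt = copy.deepcopy(train(copy.deepcopy(fresh), steps=3))
resumed = train(copy.deepcopy(full_ckpt), steps=3)

# Weights-only checkpoint: the optimizer restarts from zeroed moments.
weights_only = {"w": full_ckpt["w"], "m": 0.0, "v": 0.0, "t": 0}
restarted = train(weights_only, steps=3)

print("resumed matches uninterrupted:", resumed["w"] == uninterrupted["w"])
print("restarted matches uninterrupted:", restarted["w"] == uninterrupted["w"])
```

Restoring the moments and step count reproduces the uninterrupted trajectory, while restarting from weights alone drifts away from it. That gap is what a resumable checkpoint is meant to close.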
Evaluate against the live sampler
The sampler already has your published weights, so an eval between epochs is one call, no deploy step. Append:
train_dataset.py
Run the script with uv run python train_dataset.py. Values vary, but a successful run prints falling losses, both checkpoint paths, and a completion:
Next steps
- Deploy a checkpoint: Serve epoch-1 as a production endpoint.
- RL and advanced recipes: The Tinker cookbook recipes run on Loops; see Tinker compatibility for setup.