Deploy Training Jobs

push

truss train push [OPTIONS] CONFIG
Deploys and runs a training job.
  • CONFIG: Path to a training configuration file.
Options:
  • --tail: Tail status and logs after pushing the training job.
  • --help: Show this message and exit.
  • --remote (TEXT): Name of the remote in .trussrc to push to.
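
When the push step runs from a script or CI job, it can help to assemble the command from the options above instead of concatenating strings. The following is a minimal sketch, not part of truss itself: the helper names `build_push_command` and `run`, the file name `config.py`, and the remote name `staging` are all hypothetical.

```python
import shlex
import subprocess
from typing import List, Optional


def build_push_command(config_path: str, tail: bool = False,
                       remote: Optional[str] = None) -> List[str]:
    """Build the argv list for `truss train push [OPTIONS] CONFIG`."""
    argv = ["truss", "train", "push"]
    if tail:
        argv.append("--tail")  # follow status and logs after pushing
    if remote is not None:
        argv += ["--remote", remote]  # remote name as defined in .trussrc
    argv.append(config_path)  # CONFIG is the positional argument
    return argv


def run(argv: List[str]) -> int:
    """Echo and run the command; requires the truss CLI on PATH."""
    print("+", shlex.join(argv))
    return subprocess.run(argv).returncode


if __name__ == "__main__":
    # Hypothetical config file and remote name, for illustration only.
    raise SystemExit(run(build_push_command("config.py", tail=True, remote="staging")))
```

Passing the arguments as a list avoids shell quoting problems when a config path contains spaces.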

Monitor Training Jobs

logs

truss train logs [OPTIONS]
Fetch and display logs for a training job.
Options:
  • --job-id (TEXT): Job ID to fetch logs from.
  • --tail: Continuously stream new logs.
  • --help: Show this message and exit.
  • --remote (TEXT): Name of the remote in .trussrc.

metrics

truss train metrics [OPTIONS]
Get metrics for a training job.
Options:
  • --job-id (TEXT): Job ID to fetch metrics from.
  • --help: Show this message and exit.
  • --remote (TEXT): Name of the remote in .trussrc.

view

truss train view [OPTIONS]
List and view training jobs.
Options:
  • --project-id (TEXT): View training jobs for a specific project.
  • --job-id (TEXT): View details of a specific training job.
  • --help: Show this message and exit.
  • --remote (TEXT): Name of the remote in .trussrc.

Manage Training Jobs

stop

truss train stop [OPTIONS]
Stop a running training job.
Options:
  • --project-id (TEXT): Specify the project to stop a training job from.
  • --job-id (TEXT): ID of the job to stop.
  • --all: Stop all running jobs.
  • --help: Show this message and exit.
  • --remote (TEXT): Name of the remote in .trussrc.

Manage Training Cache

The training cache is scoped to a specific training project. The CLI allows you to see a summary of the contents in the cache to help you manage your storage.

cache summarize

truss train cache summarize <project_name or project_id> [OPTIONS]
View the contents of the training cache in a table. Optionally sort by a column name (e.g. modified or size).
Options:
  • --sort (TEXT): Column to sort by.
  • --order (TEXT): Ascending (asc) or descending (desc) order for sorting.
  • --remote (TEXT): Name of the remote in .trussrc.

Manage Checkpoints

deploy_checkpoints

truss train deploy_checkpoints [OPTIONS]
Deploy model checkpoints from a training job.
Options:
  • --project-id (TEXT): Project ID containing the checkpoints.
  • --job-id (TEXT): Job ID containing the checkpoints.
  • --config (TEXT): Path to a Python file defining a DeployCheckpointsConfig.
  • --dry-run: Generate a truss config without deploying.
  • --help: Show this message and exit.
  • --remote (TEXT): Name of the remote in .trussrc.

get_checkpoint_urls

truss train get_checkpoint_urls [OPTIONS]
Get a list of URLs to checkpoint artifacts for a training job.
Options:
  • --job-id (TEXT): Job ID containing the checkpoints.
  • --help: Show this message and exit.
  • --remote (TEXT): Name of the remote in .trussrc.

recreate

truss train recreate [OPTIONS]
Recreate a training job from an existing job ID. If no job ID is provided, the command defaults to the most recently created active training job and asks for confirmation first.
Options:
  • --job-id (TEXT): Existing Job ID of Training Job to recreate.
  • --tail: Tail status and logs after recreation.
  • --help: Show this message and exit.
  • --remote (TEXT): Name of the remote in .trussrc.

download

truss train download [OPTIONS]
Download training job artifacts.
Options:
  • --job-id (TEXT): Job ID to download artifacts from. (Required)
  • --target-directory (PATH): Directory where the file should be downloaded. Defaults to current directory.
  • --no-unzip: Instructs truss to not unzip the compressed file upon download.
  • --help: Show this message and exit.
  • --remote (TEXT): Name of the remote in .trussrc.
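
Because the monitoring and download commands are both keyed by --job-id, a script can follow a job and then fetch its artifacts in sequence. This sketch only builds the two command lines from the options documented above; the function name `follow_then_download`, the job ID `abc123`, and the directory `artifacts` are hypothetical placeholders.

```python
import subprocess
from typing import List


def follow_then_download(job_id: str, target_directory: str,
                         unzip: bool = True) -> List[List[str]]:
    """Return argv lists that stream a job's logs, then download its artifacts."""
    if not job_id:
        raise ValueError("job_id is required by `truss train download`")
    logs = ["truss", "train", "logs", "--job-id", job_id, "--tail"]
    download = ["truss", "train", "download", "--job-id", job_id,
                "--target-directory", target_directory]
    if not unzip:
        download.append("--no-unzip")  # keep the compressed file as downloaded
    return [logs, download]


if __name__ == "__main__":
    # Hypothetical job ID and output directory, for illustration only.
    for argv in follow_then_download("abc123", "artifacts"):
        subprocess.run(argv, check=True)  # stop at the first failing command
```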

Ignoring files and folders

To ignore specific files or folders, place a .truss_ignore file in the root directory of your project and list the files or folders you want truss to ignore. Entries can be absolute paths or paths relative to the location of the .truss_ignore file.
.truss_ignore
# Python cache files
__pycache__/
*.pyc
*.pyo
*.pyd

# Type checking
.mypy_cache/

# Testing
.pytest_cache/

# Some large files
data/
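
To reason about which files an ignore list like the one above would exclude, the following sketch applies the patterns to a few relative paths. It assumes gitignore-like glob semantics (a trailing slash means a directory, `*` matches within a file name); the text above does not spell out the matching rules, so truss's actual behavior may differ. The functions `parse_ignore` and `is_ignored` are hypothetical helpers, not part of truss.

```python
import fnmatch
from pathlib import PurePosixPath
from typing import Iterable, List

EXAMPLE = """\
# Python cache files
__pycache__/
*.pyc

# Some large files
data/
"""


def parse_ignore(text: str) -> List[str]:
    """Return the non-empty, non-comment lines of an ignore file."""
    stripped = (line.strip() for line in text.splitlines())
    return [line for line in stripped if line and not line.startswith("#")]


def is_ignored(rel_path: str, patterns: Iterable[str]) -> bool:
    """Assumed gitignore-like matching; the real truss rules may differ."""
    parts = PurePosixPath(rel_path).parts
    if not parts:
        return False
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: matches if any parent directory has this name.
            name = pattern.rstrip("/")
            if any(fnmatch.fnmatchcase(part, name) for part in parts[:-1]):
                return True
        elif fnmatch.fnmatchcase(parts[-1], pattern):
            # File pattern: compared against the file name only.
            return True
    return False


if __name__ == "__main__":
    patterns = parse_ignore(EXAMPLE)
    for path in ["train.py", "src/__pycache__/mod.pyc", "data/train.bin"]:
        print(path, "ignored" if is_ignored(path, patterns) else "kept")
```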