Training CLI reference
Deploy, manage, and monitor training jobs using the Truss CLI.
Deploy Training Jobs
push
Deploys and runs a training job.
CONFIG: Path to a training configuration file.
Options:
--remote (TEXT): Name of the remote in .trussrc to push to.
--tail: Tail status and logs after pushing the training job.
--help: Show this message and exit.
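As a minimal sketch of how the push options combine, the helper below assembles an argument list that could be passed to subprocess.run. It assumes the command is invoked as truss train push, a prefix this reference does not spell out; the function name build_push_command, the file name config.py, and the remote name baseten are hypothetical.

```python
import shlex


def build_push_command(config_path, remote=None, tail=False):
    """Assemble the argument list for pushing a training job.

    Assumption: the command is invoked as `truss train push`; the
    `truss train` prefix is not stated in this reference.
    """
    cmd = ["truss", "train", "push", config_path]
    if remote is not None:
        # --remote names an entry in .trussrc
        cmd += ["--remote", remote]
    if tail:
        # --tail follows status and logs after the push
        cmd.append("--tail")
    return cmd


if __name__ == "__main__":
    # "config.py" and "baseten" are placeholder values.
    cmd = build_push_command("config.py", remote="baseten", tail=True)
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when a config path contains spaces.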
Monitor Training Jobs
logs
Fetch and display logs for a training job.
Options:
--remote (TEXT): Name of the remote in .trussrc.
--job-id (TEXT): Job ID to fetch logs from.
--tail: Continuously stream new logs.
--help: Show this message and exit.
metrics
Get metrics for a training job.
Options:
--remote (TEXT): Name of the remote in .trussrc.
--job-id (TEXT): Job ID to fetch metrics from.
--help: Show this message and exit.
view
List and view training jobs.
Options:
--remote (TEXT): Name of the remote in .trussrc.
--project-id (TEXT): View training jobs for a specific project.
--job-id (TEXT): View details of a specific training job.
--help: Show this message and exit.
Manage Training Jobs
stop
Stop a running training job.
Options:
--remote (TEXT): Name of the remote in .trussrc.
--project-id (TEXT): Specify the project to stop a training job from.
--job-id (TEXT): ID of the job to stop.
--all: Stop all running jobs.
--help: Show this message and exit.
deploy_checkpoints
Deploy model checkpoints from a training job.
Options:
--remote (TEXT): Name of the remote in .trussrc.
--project-id (TEXT): Project ID containing the checkpoints.
--job-id (TEXT): Job ID containing the checkpoints.
--config (TEXT): Path to a Python file defining a DeployCheckpointsConfig.
--dry-run: Generate a truss config without deploying.
--help: Show this message and exit.
get_checkpoint_urls
Get a list of URLs to checkpoint artifacts for a training job.
Options:
--remote (TEXT): Name of the remote in .trussrc.
--job-id (TEXT): Job ID containing the checkpoints.
--help: Show this message and exit.
recreate
Recreate a training job from an existing job ID. If no job ID is provided, the command defaults to the most recently created active training job and asks for confirmation first.
Options:
--job-id (TEXT): Existing Job ID of Training Job to recreate.
--remote (TEXT): Name of the remote in .trussrc.
--tail: Tail status and logs after recreation.
--help: Show this message and exit.
download
Download training job artifacts.
Options:
--job-id (TEXT): Job ID to download artifacts from. (Required)
--remote (TEXT): Name of the remote in .trussrc.
--target-directory (PATH): Directory where the file should be downloaded. Defaults to current directory.
--no-unzip: Instructs truss to not unzip the compressed file upon download.
--help: Show this message and exit.
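The download options can be sketched the same way. The helper below is illustrative only: it assumes the command is invoked as truss train download (the truss train prefix is not stated in this reference), and the function name build_download_command, the job ID abc123, and the directory artifacts are hypothetical.

```python
from pathlib import Path


def build_download_command(job_id, remote=None, target_directory=None, unzip=True):
    """Assemble the argument list for downloading training job artifacts.

    Assumption: the command is invoked as `truss train download`.
    """
    # --job-id is the only required option.
    cmd = ["truss", "train", "download", "--job-id", job_id]
    if remote is not None:
        cmd += ["--remote", remote]
    if target_directory is not None:
        # Omitting this option leaves the default (current directory).
        cmd += ["--target-directory", str(Path(target_directory))]
    if not unzip:
        # --no-unzip keeps the compressed file as downloaded.
        cmd.append("--no-unzip")
    return cmd


if __name__ == "__main__":
    # "abc123" and "artifacts" are placeholder values.
    print(build_download_command("abc123", target_directory="artifacts", unzip=False))
```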
Ignoring files and folders
To ignore specific files or folders, place a .truss_ignore file in the root directory of your project and list the files or folders you want truss to ignore.
These can be absolute paths or paths relative to the location of the .truss_ignore