VS Code Remote Tunnels and their Cursor equivalent let your local IDE attach to a training container without SSH keys, open ports, or direct network access. You authenticate via a device code flow through Microsoft or GitHub, and the tunnel connects your IDE to the container securely. Use a remote tunnel to debug a failed training job, inspect state on a running job, or develop interactively without resubmitting.
If you prefer a standard terminal over an IDE, see SSH access for direct SSH connections to training containers.

Prerequisites

  • VS Code or Cursor installed locally.
  • The Remote - Tunnels extension installed in your IDE.
  • A Microsoft or GitHub account for device flow authentication.

Quick start

This walkthrough uses the MNIST PyTorch example to push a training job with a remote tunnel enabled, then connects to the container.

1. Clone the example

Clone the ml-cookbook and navigate to the MNIST training example:
git clone https://github.com/basetenlabs/ml-cookbook.git
cd ml-cookbook/examples/mnist-pytorch/training

2. Configure and push the job

Add an interactive session to your config.py:
config.py
from truss_train import TrainingProject, TrainingJob, Image, Compute, Runtime
from truss_train.definitions import (
    InteractiveSession,
    InteractiveSessionTrigger,
    InteractiveSessionAuthProvider,
)
from truss.base.truss_config import AcceleratorSpec

training_job = TrainingJob(
    image=Image(base_image="pytorch/pytorch:2.7.0-cuda12.8-cudnn9-runtime"),
    compute=Compute(
        accelerator=AcceleratorSpec(accelerator="H200", count=1),
    ),
    runtime=Runtime(
        start_commands=["python train.py"],
    ),
    interactive_session=InteractiveSession(
        trigger=InteractiveSessionTrigger.ON_STARTUP,
        auth_provider=InteractiveSessionAuthProvider.MICROSOFT,  # You can also use GITHUB
    ),
)

training_project = TrainingProject(name="mnist-training", job=training_job)
Push the job:
truss train push config.py
Once the job is running, retrieve the auth code using truss train isession:
truss train isession --job-id <job_id>
The expected output will look similar to this:
Interactive Sessions for Job: <job_id>
Replica ID  Tunnel Name              Auth Code  Auth URL                             Generated At (Local)
r0          bt-session-<job_id>-0    AB12-CD34  https://login.microsoftonline.com/…  14:30:00 PST
You can also view this table in truss train logs --job-id <job_id> --tail alongside your training logs.

3. Authenticate and connect

Connecting to the tunnel relies on the Remote - Tunnels extension in your IDE.
  1. Open the Auth URL from the table in your browser.
  2. Enter the Auth Code shown in the table.
  3. Connect to the tunnel in your IDE:
       1. Open the command palette (Cmd+Shift+P on macOS, Ctrl+Shift+P on Windows/Linux).
       2. Select Remote-Tunnels: Connect to Tunnel.
       3. Select the tunnel named bt-session-<job_id>-<node_rank> (for example, bt-session-abc123-0).
Open your workspace to the desired folder path to start debugging, editing your training script, or running commands. By default, your source files are extracted to /b10/workspace (available as $BT_WORKING_DIR). If you set enable_baseten_workdir=False, Baseten uses your base image’s WORKDIR instead.
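As a minimal sketch, a script run inside the container could locate the extracted source files by reading $BT_WORKING_DIR. The helper name resolve_workdir is hypothetical and not part of the Baseten SDK, and falling back to the current directory when the variable is unset is an assumption made for illustration (for example, when the same script runs locally).

```python
import os
from pathlib import Path


def resolve_workdir(env=None):
    """Return the directory that holds the job's source files.

    Reads BT_WORKING_DIR, which the text above names. Falling back to the
    current directory when it is unset is an assumption of this sketch.
    """
    env = os.environ if env is None else env
    return Path(env.get("BT_WORKING_DIR") or os.getcwd())


if __name__ == "__main__":
    print(f"Open this folder in your IDE: {resolve_workdir()}")
```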

Trigger modes and session management

Trigger modes (on_startup, on_failure, on_demand), activating on-demand sessions, viewing status with truss train isession, and extending session timeouts are shared across SSH and remote tunnel sessions. For more information, see Remote access.
For remote tunnel sessions specifically: auth codes appear in truss train isession as soon as the tunnel starts, regardless of trigger mode. With on_failure, the container stays alive for interactive use only after training fails. With on_demand, the container stays alive only after you authenticate or explicitly change the trigger.
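The keep-alive rules above can be read as a small decision table. The following is an illustrative model only, not Baseten's implementation: container_stays_alive is a hypothetical function, and the on_startup branch (the container is held from startup) is an assumption, since the text above only spells out on_failure and on_demand.

```python
def container_stays_alive(trigger, training_failed=False, authenticated=False):
    """Model whether the container is held open for interactive use.

    If you explicitly change the trigger, call this again with the new mode.
    """
    if trigger == "on_startup":
        # Assumption: the session is held open from startup until it expires.
        return True
    if trigger == "on_failure":
        # Held open only after training fails.
        return training_failed
    if trigger == "on_demand":
        # Held open only after you authenticate.
        return authenticated
    raise ValueError(f"unknown trigger mode: {trigger!r}")


for mode in ("on_startup", "on_failure", "on_demand"):
    print(mode, container_stays_alive(mode))
```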

Configuration

Configure interactive sessions with CLI flags or the Python SDK. CLI flags override SDK values when both are set.
Pass --interactive to truss train push with a trigger mode:
truss train push config.py \
  --interactive on_startup \
  --interactive-timeout-minutes 120
Set timeout_minutes to -1 to extend the session expiry to 10 years. See Timeout and expiry for details.
For SSH-enabled interactive workstations, use truss train workstation.
See the CLI reference for all push options.

Timeout and expiry

Sessions expire based on the timeout_minutes setting (default: 480 minutes, or 8 hours). Set timeout_minutes to -1 to extend the expiry to 10 years.
  1. When the tunnel starts successfully, Baseten sets the expiry to now + timeout_minutes.
  2. Each time the tunnel reconnects, the expiry resets to now + timeout_minutes.
  3. When the expiry passes, the session ends and the container shuts down.
The timeout resets on tunnel reconnection, not on general IDE activity. If you disconnect and reconnect, the timer resets. If you stay connected but idle, the session expires after the configured timeout.
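The expiry rules above can be sketched as a short calculation. This is an illustrative model rather than Baseten's code: session_expiry is a hypothetical helper, and treating "10 years" as 3650 days is an assumption made to keep the arithmetic simple.

```python
from datetime import datetime, timedelta

# Assumption: "10 years" is approximated as 3650 days for this sketch.
TEN_YEARS = timedelta(days=3650)


def session_expiry(connect_times, timeout_minutes=480):
    """Return when the session expires, or None if the tunnel never started.

    connect_times holds the tunnel start plus any later reconnections.
    """
    window = TEN_YEARS if timeout_minutes == -1 else timedelta(minutes=timeout_minutes)
    expiry = None
    for connected_at in sorted(connect_times):
        if expiry is not None and connected_at > expiry:
            break  # the session had already ended; this reconnection is too late
        expiry = connected_at + window  # a start or reconnect resets the timer
    return expiry


start = datetime(2025, 1, 1, 9, 0)
reconnect = datetime(2025, 1, 1, 10, 30)
print(session_expiry([start], timeout_minutes=120))             # 2025-01-01 11:00:00
print(session_expiry([start, reconnect], timeout_minutes=120))  # 2025-01-01 12:30:00
```

Staying connected but idle adds no entries to connect_times, so the expiry does not move.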

What happens when a session expires

When a session expires, Baseten signals the container to shut down gracefully rather than hard-killing it, so the container receives the signal and exits cleanly. Baseten preserves any files you saved to $BT_CHECKPOINT_DIR, but you lose unsaved work in the container’s local filesystem.
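One way to avoid losing work at expiry is to write state to $BT_CHECKPOINT_DIR when the shutdown signal arrives. The sketch below assumes the signal is SIGTERM, which the text above does not specify. The helpers save_state and install_shutdown_hook, the session_state.json file name, and the local "checkpoints" fallback directory are all hypothetical.

```python
import json
import os
import signal
import sys
from pathlib import Path


def save_state(state, directory=None):
    """Write state as JSON under the checkpoint directory and return the path."""
    # The "checkpoints" fallback is an assumption for running outside Baseten.
    target = Path(directory or os.environ.get("BT_CHECKPOINT_DIR", "checkpoints"))
    target.mkdir(parents=True, exist_ok=True)
    path = target / "session_state.json"
    path.write_text(json.dumps(state))
    return path


def install_shutdown_hook(get_state):
    """Save get_state() and exit cleanly when the process receives SIGTERM."""

    def _on_shutdown(signum, frame):
        save_state(get_state())
        sys.exit(0)

    # Assumption: the graceful shutdown signal is SIGTERM.
    signal.signal(signal.SIGTERM, _on_shutdown)
```

A training script could call install_shutdown_hook(lambda: {"epoch": current_epoch}) once at startup, where current_epoch stands in for whatever state the script tracks.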

Multi-node sessions

For multi-node training jobs, Baseten creates one tunnel per node. Each node gets its own auth code, and you connect to each node independently. Tunnel names follow the format bt-session-<job_id>-<node_rank>, where node_rank starts at 0. For example, a 2-node job produces:
  • bt-session-abc123-0 (node 0)
  • bt-session-abc123-1 (node 1)
The truss train isession command displays auth codes for all nodes in a single table.
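The naming scheme is simple enough to generate in a script, for example to list the tunnels to expect before connecting. The helper tunnel_names below is hypothetical and only mirrors the bt-session-<job_id>-<node_rank> format described above.

```python
def tunnel_names(job_id, node_count):
    """Return the expected tunnel name for each node, with ranks starting at 0."""
    return [f"bt-session-{job_id}-{rank}" for rank in range(node_count)]


print(tunnel_names("abc123", 2))  # ['bt-session-abc123-0', 'bt-session-abc123-1']
```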