Data privacy
Baseten does not store model inputs, outputs, or weights by default.
- Model inputs/outputs: Inputs for async inference are stored temporarily, only until they are processed. Outputs are never stored.
- Model weights: Loaded dynamically from sources like Hugging Face, GCS, or S3 and moved directly into GPU memory.
  - Users can enable caching via Truss. Cached weights can be permanently deleted on request.
- Postgres data tables: Existing users may store data in Baseten’s hosted Postgres tables, which can be deleted at any time.
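The retention rule for async inference (inputs held only until processed, outputs never stored) can be illustrated with a toy in-memory queue. This is a minimal sketch for intuition, not Baseten's implementation: the class `AsyncRequestQueue`, its methods, and the `deliver` callback are all hypothetical names introduced here.

```python
from collections import deque
from typing import Any, Callable, Deque, Tuple


class AsyncRequestQueue:
    """Toy model of the retention rule (hypothetical, not Baseten's code)."""

    def __init__(self) -> None:
        # Only pending inputs are held; there is no store for outputs.
        self._pending: Deque[Tuple[str, Any]] = deque()

    def enqueue(self, request_id: str, model_input: Any) -> None:
        """Hold the input until a worker picks it up."""
        self._pending.append((request_id, model_input))

    def process_next(
        self,
        model: Callable[[Any], Any],
        deliver: Callable[[str, Any], None],
    ) -> None:
        """Run one request; the input leaves the queue as soon as it is taken."""
        request_id, model_input = self._pending.popleft()
        # The output is handed straight to the caller-supplied callback
        # and is not kept anywhere in this object.
        deliver(request_id, model(model_input))

    def stored_inputs(self) -> int:
        """Number of inputs currently held."""
        return len(self._pending)
```

In this sketch, `stored_inputs()` drops back to zero once every queued request has been processed, and nothing in the object retains a model output.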
Workload security
Inference workloads are isolated to protect users and Baseten’s infrastructure.
- Container security:
  - No GPUs are shared across users.
  - Security tooling: Falco (Sysdig), Gatekeeper (Pod Security Policies).
  - Minimal privileges for workloads and nodes to limit incident impact.
- Network security:
  - Each customer has a dedicated Kubernetes namespace.
  - Isolation enforced via Calico.
  - Nodes run in a private subnet with firewall protections.
- Pentesting:
  - Extended pentesting by RunSybil (ex-OpenAI and CrowdStrike experts).
  - Malicious model deployments tested in a dedicated prod-like environment.
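To make the namespace isolation under "Network security" concrete, here is a sketch of the kind of Kubernetes `NetworkPolicy` that a CNI such as Calico can enforce. It uses the standard `networking.k8s.io/v1` schema to admit ingress only from pods in the same namespace. The policy name, the example namespace, and the helper functions are assumptions made for illustration; Baseten's actual policies are not published in this document and may differ.

```python
import json


def namespace_isolation_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that admits ingress only from the same namespace.

    Illustrative only: the policy name and overall shape are assumptions,
    not Baseten's real configuration.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": "deny-from-other-namespaces",  # hypothetical name
            "namespace": namespace,
        },
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            # A peer with only a podSelector matches pods in the policy's
            # own namespace, so traffic from other namespaces is dropped.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


def to_manifest(policy: dict) -> str:
    """Serialize the policy as JSON, which kubectl accepts as a manifest."""
    return json.dumps(policy, indent=2)
```

Calling `to_manifest(namespace_isolation_policy("customer-a"))` yields a manifest for a hypothetical `customer-a` namespace. With one such policy per customer namespace, pods can talk to their neighbors in the same namespace but not to pods belonging to another customer.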