Documentation Index
Fetch the complete documentation index at: https://docs.baseten.co/llms.txt
Use this file to discover all available pages before exploring further.
Use b10cache for files your model writes at runtime that other replicas can reuse, such as torch.compile artifacts. For read-only weights known at deploy time, use BDN. For more information, see Data and storage.

Overview
Deployments sometimes produce files that are useful to other replicas. Using torch.compile, for example, produces a cache that can speed up future torch.compile calls on the same function, reducing cold start time for other replicas.
b10cache stores these files. It’s a volume mounted over the network onto each of your pods, with two scopes:
Organization scope: /cache/org/
Shared across every pod you deploy in your organization. Move a file into this directory and any pod can read it.
Deployment scope: /cache/model/
Shared across every pod within a single deployment. Use this scope to keep deployment filesystems isolated.
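To make the two scopes concrete, here is a minimal sketch of publishing an artifact into one of these directories. The helper and its parameters are illustrative, not a Baseten API; the cache directory would be `Path("/cache/org")` or `Path("/cache/model")` on a real pod. Writing under a temporary name and renaming keeps other pods from reading a half-written file, since a rename within one filesystem is atomic.

```python
import shutil
from pathlib import Path

def publish_to_cache(local_file: Path, cache_dir: Path) -> Path:
    """Copy a locally produced file into a b10cache directory.

    Hypothetical helper: cache_dir stands in for the mounted
    /cache/org/ or /cache/model/ volume described above.
    """
    dest = cache_dir / local_file.name
    # Copy under a temporary name first, then rename, so other pods
    # never observe a partially written file.
    tmp = dest.with_suffix(dest.suffix + ".tmp")
    shutil.copy(local_file, tmp)
    tmp.rename(dest)
    return dest
```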
Not persistent object storage
b10cache is reliable, but treat it as a cache, not a database. Always have a fallback path that runs if the file isn’t there yet. For example, the first replica of a new deployment writes to b10cache rather than reading from it.

Related features
For more information, see Torch compile caching, which uses b10cache to persist torch.compile artifacts across replicas.
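The fallback pattern recommended above can be sketched as a read-through cache: use the file when it exists, otherwise build the artifact and publish it for later replicas. The function names and the `build` callable are assumptions for illustration, not part of the b10cache interface.

```python
from pathlib import Path
from typing import Callable

def load_or_build(cache_path: Path, build: Callable[[], bytes]) -> bytes:
    """Read-through pattern for b10cache.

    cache_path would live under /cache/org/ or /cache/model/ on a pod;
    `build` is a hypothetical callable that produces the artifact bytes
    from scratch (the slow path).
    """
    if cache_path.exists():
        # Cache hit: a later replica reuses the published artifact.
        return cache_path.read_bytes()
    # Cache miss: the first replica of a new deployment lands here,
    # so it builds the artifact and publishes it for everyone else.
    data = build()
    tmp = cache_path.with_suffix(cache_path.suffix + ".tmp")
    tmp.write_bytes(data)
    tmp.rename(cache_path)  # atomic publish within the mounted volume
    return data
```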