# Baseten

## Docs

- [How Baseten works](https://docs.baseten.co/concepts/howbasetenworks.md): Baseten is a platform for building, serving, and scaling AI models in production.
- [Why Baseten](https://docs.baseten.co/concepts/whybaseten.md): Baseten delivers fast, scalable AI/ML inference with enterprise-grade security and reliability—whether in our cloud or yours.
- [Autoscaling](https://docs.baseten.co/deployment/autoscaling.md): Autoscaling dynamically adjusts the number of active replicas to **handle variable traffic** while minimizing idle compute costs.
- [Concepts](https://docs.baseten.co/deployment/concepts.md)
- [Deployments](https://docs.baseten.co/deployment/deployments.md): Deploy, manage, and scale machine learning models with Baseten
- [Environments](https://docs.baseten.co/deployment/environments.md): Manage your model’s release cycles with environments.
- [Resources](https://docs.baseten.co/deployment/resources.md): Manage and configure model resources
- [Binary IO](https://docs.baseten.co/development/chain/binaryio.md): Performant serialization of numeric data
- [Concepts](https://docs.baseten.co/development/chain/concepts.md): Glossary of Chains concepts and terminology
- [Deploy](https://docs.baseten.co/development/chain/deploy.md): Deploy your Chain on Baseten
- [Architecture and design](https://docs.baseten.co/development/chain/design.md): How to structure your Chainlets
- [Engine-Builder LLM Models](https://docs.baseten.co/development/chain/engine-builder-models.md): Engine-Builder LLM models are pre-trained models that are optimized for specific inference tasks.
- [Error Handling](https://docs.baseten.co/development/chain/errorhandling.md): Understanding and handling Chains errors
- [Your first Chain](https://docs.baseten.co/development/chain/getting-started.md): Build and deploy two example Chains
- [Invocation](https://docs.baseten.co/development/chain/invocation.md): Call your deployed Chain
- [Local Development](https://docs.baseten.co/development/chain/localdev.md): Iterating, Debugging, Testing, Mocking
- [Overview](https://docs.baseten.co/development/chain/overview.md)
- [Streaming](https://docs.baseten.co/development/chain/streaming.md): Streaming outputs, reducing latency, SSEs
- [Truss Integration](https://docs.baseten.co/development/chain/stub.md): Integrate deployed Truss models with stubs
- [Subclassing](https://docs.baseten.co/development/chain/subclassing.md): Modularize and re-use Chainlet implementations
- [Watch](https://docs.baseten.co/development/chain/watch.md): Live-patch deployed code
- [Concepts](https://docs.baseten.co/development/concepts.md)
- [Deprecation](https://docs.baseten.co/development/model-apis/deprecation.md): Baseten's deprecation policy for Model APIs
- [Model APIs](https://docs.baseten.co/development/model-apis/overview.md): OpenAI-compatible endpoints for high-performance LLMs
- [Rate limits and budgets](https://docs.baseten.co/development/model-apis/rate-limits-and-budgets.md): Rate limits and usage budgets for Model APIs
- [Reasoning](https://docs.baseten.co/development/model-apis/reasoning.md): Control extended thinking for reasoning-capable models
- [b10cache 🆕](https://docs.baseten.co/development/model/b10cache.md): Persist data across replicas or deployments
- [Base Docker images](https://docs.baseten.co/development/model/base-images.md): A guide to configuring a base image for your truss
- [Custom build commands](https://docs.baseten.co/development/model/build-commands.md): How to run your own Docker commands during the build stage
- [Your first model](https://docs.baseten.co/development/model/build-your-first-model.md): Build and deploy your first model
- [Python driven configuration for models 🆕](https://docs.baseten.co/development/model/code-first-development.md): Use code-first development tools to streamline model production.
- [Request concurrency](https://docs.baseten.co/development/model/concurrency.md): A guide to setting concurrency for your model
- [Configuration](https://docs.baseten.co/development/model/configuration.md): How to configure your model.
- [Custom health checks](https://docs.baseten.co/development/model/custom-health-checks.md): Customize health checks for your deployments.
- [Deploy custom Docker images](https://docs.baseten.co/development/model/custom-server.md): Deploy custom Docker images to run inference servers like vLLM, SGLang, Triton, or any containerized application.
- [Data and storage](https://docs.baseten.co/development/model/data-directory.md): Load model weights without Hugging Face or S3
- [Deploy and iterate](https://docs.baseten.co/development/model/deploy-and-iterate.md): Deploy your model and quickly iterate on it.
- [Access model environments](https://docs.baseten.co/development/model/environments.md): A guide to leveraging environments in your models
- [gRPC 🆕](https://docs.baseten.co/development/model/grpc.md): Invoke your model over gRPC.
- [Implementation](https://docs.baseten.co/development/model/implementation.md): How to implement your model.
- [Cached weights 🆕](https://docs.baseten.co/development/model/model-cache.md): Accelerate cold starts and improve availability by prefetching and caching your weights.
- [Developing a Model on Baseten](https://docs.baseten.co/development/model/overview.md): Key concepts and the workflow you'll use to package, configure, and iterate on models with Baseten's developer tooling.
- [Performance optimization](https://docs.baseten.co/development/model/performance-optimization.md): Optimize model latency, throughput, and cost with Baseten engines
- [Private Docker registries](https://docs.baseten.co/development/model/private-registries.md): A guide to configuring a private container registry for your truss
- [Using request objects / cancellation](https://docs.baseten.co/development/model/requests.md): Get more control by directly using the request object.
- [Custom responses](https://docs.baseten.co/development/model/responses.md): Get more control by directly creating the response object.
- [Secrets](https://docs.baseten.co/development/model/secrets.md): Use secrets securely in your models
- [Streaming output](https://docs.baseten.co/development/model/streaming.md): Streaming output for LLMs
- [Torch compile caching 🆕](https://docs.baseten.co/development/model/torch-compile-cache.md): Accelerate cold starts by loading previous compilation artifacts.
- [WebSockets 🆕](https://docs.baseten.co/development/model/websockets.md): Enable real-time, streaming, bidirectional communication using WebSockets for Truss models and Chainlets.
- [BEI-Bert](https://docs.baseten.co/engines/bei/bei-bert.md): BERT-optimized embeddings with fast cold-start performance
- [Configuration reference](https://docs.baseten.co/engines/bei/bei-reference.md): Complete reference config for BEI and BEI-Bert engines
- [Overview](https://docs.baseten.co/engines/bei/overview.md): Production-grade embeddings, reranking, and classification models
- [Gated features for BIS-LLM](https://docs.baseten.co/engines/bis-llm/advanced-features.md): KV-aware routing, disaggregated serving, and other gated features
- [Reference Config (BIS-LLM)](https://docs.baseten.co/engines/bis-llm/bis-llm-config.md): Complete reference config for the V2 inference stack and MoE models
- [Overview](https://docs.baseten.co/engines/bis-llm/overview.md): Next-generation engine for MoE models with advanced optimizations
- [Custom engine builder](https://docs.baseten.co/engines/engine-builder-llm/custom-engine-builder.md): Implement a custom model.py for business logic, logging, and advanced inference patterns
- [Reference config (Engine-Builder-LLM)](https://docs.baseten.co/engines/engine-builder-llm/engine-builder-config.md): Complete reference config for dense text generation models
- [Speculative decoding guide](https://docs.baseten.co/engines/engine-builder-llm/lookahead-decoding.md): Faster inference with speculative decoding for coding agents and text generation
- [LoRA support](https://docs.baseten.co/engines/engine-builder-llm/lora-support.md): Multi-LoRA adapters for the Engine-Builder-LLM engine
- [Overview](https://docs.baseten.co/engines/engine-builder-llm/overview.md): Dense LLM text generation with lookahead decoding and structured outputs
- [Overview](https://docs.baseten.co/engines/index.md): Engine selection guide for embeddings, dense LLMs, and MoE models
- [Auto-Scaling Engines](https://docs.baseten.co/engines/performance-concepts/autoscaling-engines.md): Performant autoscaling tailored to embedding and generation models on Baseten
- [Deploy training and S3 checkpoints](https://docs.baseten.co/engines/performance-concepts/deployment-from-training-and-s3.md): Deploy training checkpoints and cloud storage models with TensorRT-LLM optimization.
- [Function calling](https://docs.baseten.co/engines/performance-concepts/function-calling.md): Tool selection and structured function calls with LLMs
- [Performance client](https://docs.baseten.co/engines/performance-concepts/performance-client.md): High-performance client library for embeddings, reranking, classification, and generic batch requests
- [Quantization guide](https://docs.baseten.co/engines/performance-concepts/quantization-guide.md): FP8 and FP4 trade-offs and hardware requirements for all engines
- [Structured outputs](https://docs.baseten.co/engines/performance-concepts/structured-outputs.md): JSON schema validation and controlled text generation across all engines
- [Embeddings with BEI](https://docs.baseten.co/examples/bei.md): Serve embedding, reranking, and classification models
- [Transcribe audio with Chains](https://docs.baseten.co/examples/chains-audio-transcription.md): Process hours of audio in seconds using efficient chunking, distributed inference, and optimized GPU resources.
- [RAG pipeline with Chains](https://docs.baseten.co/examples/chains-build-rag.md): Build a RAG (retrieval-augmented generation) pipeline with Chains
- [Deploy a ComfyUI project](https://docs.baseten.co/examples/comfyui.md): Deploy your ComfyUI workflow as an API endpoint
- [Deploy your first model](https://docs.baseten.co/examples/deploy-your-first-model.md): Learn how to package and deploy an AI model as a production-ready API endpoint on Baseten.
- [Dockerized model](https://docs.baseten.co/examples/docker.md): Deploy any model in a pre-built Docker container
- [Image generation](https://docs.baseten.co/examples/image-generation.md): Building a text-to-image model with Flux Schnell
- [DeepSeek R1](https://docs.baseten.co/examples/models/deepseek/deepseek-r1.md): A state-of-the-art 671B-parameter MoE LLM with o1-style reasoning, licensed for commercial use
- [DeepSeek-R1 Qwen 7B](https://docs.baseten.co/examples/models/deepseek/deepseek-r1-qwen-7b.md): Qwen 7B fine-tuned for CoT reasoning capabilities with DeepSeek R1
- [Flux-Schnell](https://docs.baseten.co/examples/models/flux/flux-schnell.md): Flux-Schnell is a state-of-the-art image generation model
- [Gemma 3 27B IT](https://docs.baseten.co/examples/models/gemma/gemma-3-27b-it.md): Instruct-tuned open model by Google with an excellent ELO/size tradeoff and vision capabilities
- [Kokoro](https://docs.baseten.co/examples/models/kokoro/kokoro.md): Kokoro is a frontier TTS model for its size (82 million parameters; text in, audio out).
- [Llama 3.3 70B Instruct](https://docs.baseten.co/examples/models/llama/llama-3.3-70B-instruct.md): Llama 3.3 70B Instruct is a large language model optimized for instruction following.
- [MARS6](https://docs.baseten.co/examples/models/mars/MARS6.md): MARS6 is a frontier text-to-speech model by CAMB.AI with voice/prosody cloning capabilities in 10 languages. MARS6 must be licensed for commercial use; we can help!
- [All MPNet Base V2](https://docs.baseten.co/examples/models/microsoft/all-mpnet-base-v2.md): A text embedding model with a context window of 384 tokens and a dimensionality of 768.
- [Nomic Embed v1.5](https://docs.baseten.co/examples/models/nomic/nomic-embed-v1-5.md): SOTA text embedding model with variable dimensionality — outperforms OpenAI's text-embedding-ada-002 and text-embedding-3-small models.
- [Overview](https://docs.baseten.co/examples/models/overview.md): Browse our library of open-source models that are ready to deploy behind an API endpoint in seconds.
- [Qwen-2-5-32B-Coder-Instruct](https://docs.baseten.co/examples/models/qwen/qwen-2-5-32b-coder-instruct.md): Qwen 2.5 32B Coder is an OpenAI-compatible model and can be called using the OpenAI SDK in any language.
- [SDXL Lightning](https://docs.baseten.co/examples/models/stable-diffusion/sdxl-lightning.md): A variant of Stable Diffusion XL that generates 1024x1024 px images in 4 UNet steps, enabling near real-time image creation.
- [Whisper V3](https://docs.baseten.co/examples/models/whisper/whisper-v3-fastest.md): Whisper V3 is a fast and accurate speech recognition model.
- [Building with Baseten](https://docs.baseten.co/examples/overview.md)
- [Deploy LLMs with SGLang](https://docs.baseten.co/examples/sglang.md): Optimized inference for LLMs with SGLang
- [Speculative Decoding Examples](https://docs.baseten.co/examples/speculative-decoding.md): Lookahead decoding configurations for faster inference
- [LLM with Streaming](https://docs.baseten.co/examples/streaming.md): Building an LLM with streaming output
- [Fast LLMs with TensorRT-LLM](https://docs.baseten.co/examples/tensorrt-llm.md): Optimize LLMs for low latency and high throughput
- [Text to speech](https://docs.baseten.co/examples/text-to-speech.md): Building a text-to-speech model with Kokoro
- [Run any LLM with vLLM](https://docs.baseten.co/examples/vllm.md): Serve a wide range of models
- [Async inference](https://docs.baseten.co/inference/async.md): Run asynchronous inference on deployed models
- [Call your model](https://docs.baseten.co/inference/calling-your-model.md): Run inference on deployed models (a usage sketch appears at the end of this index)
- [Concepts](https://docs.baseten.co/inference/concepts.md)
- [Integrations](https://docs.baseten.co/inference/integrations.md): Integrate your models with tools like LangChain, LiteLLM, and more.
- [Model I/O in binary](https://docs.baseten.co/inference/output-format/binary.md): Decode and save binary model output
- [Model I/O with files](https://docs.baseten.co/inference/output-format/files.md): Call models by passing a file or URL
- [Streaming](https://docs.baseten.co/inference/streaming.md): How to call a model that has a streaming-capable endpoint.
- [Export to Datadog](https://docs.baseten.co/observability/export-metrics/datadog.md): Export metrics from Baseten to Datadog
- [Export to Grafana Cloud](https://docs.baseten.co/observability/export-metrics/grafana.md): Export metrics from Baseten to Grafana Cloud
- [Export to New Relic](https://docs.baseten.co/observability/export-metrics/new-relic.md): Export metrics from Baseten to New Relic
- [Overview](https://docs.baseten.co/observability/export-metrics/overview.md): Export metrics from Baseten to your observability stack
- [Export to Prometheus](https://docs.baseten.co/observability/export-metrics/prometheus.md): Export metrics from Baseten to Prometheus
- [Metrics support matrix](https://docs.baseten.co/observability/export-metrics/supported-metrics.md): Which metrics can be exported
- [Status and health](https://docs.baseten.co/observability/health.md): Every model deployment in your Baseten workspace has a status to represent its activity and health.
- [Metrics](https://docs.baseten.co/observability/metrics.md): Understand the load and performance of your model
- [Secure model inference](https://docs.baseten.co/observability/security.md): Keeping your models safe and private
- [Tracing](https://docs.baseten.co/observability/tracing.md): Investigate the prediction flow in detail
- [Billing and usage](https://docs.baseten.co/observability/usage.md): Manage payments and track overall Baseten usage
- [Access control](https://docs.baseten.co/organization/access.md): Manage access to your Baseten organization with role-based access control.
- [API keys](https://docs.baseten.co/organization/api-keys.md): Authenticate requests to Baseten for deployment, inference, and management.
- [Organization settings](https://docs.baseten.co/organization/overview.md): Manage your Baseten organization's access, security, and resources.
- [Restricted environments](https://docs.baseten.co/organization/restricted-environments.md): Control access to sensitive environments like production with environment-level permissions.
- [Secrets](https://docs.baseten.co/organization/secrets.md): Store and access sensitive credentials in your deployed models.
- [Teams 🆕](https://docs.baseten.co/organization/teams.md): Split your organization into multiple teams with isolated resources and granular access control.
- [Documentation](https://docs.baseten.co/overview.md): Baseten is a platform for deploying and serving AI models performantly, scalably, and cost-efficiently.
- [Quick start](https://docs.baseten.co/quickstart.md)
- [Chains CLI reference](https://docs.baseten.co/reference/cli/chains/chains-cli.md): Deploy, manage, and develop Chains using the Truss CLI.
- [Truss CLI overview](https://docs.baseten.co/reference/cli/index.md): Install and configure the Truss CLI for deploying models, chains, and training jobs.
- [Training CLI reference](https://docs.baseten.co/reference/cli/training/training-cli.md): Deploy, manage, and monitor training jobs using the Truss CLI.
- [truss cleanup](https://docs.baseten.co/reference/cli/truss/cleanup.md): Clean up Truss data.
- [truss configure](https://docs.baseten.co/reference/cli/truss/configure.md): Configure Truss settings.
- [truss container](https://docs.baseten.co/reference/cli/truss/container.md): Run and manage Truss containers locally.
- [truss image](https://docs.baseten.co/reference/cli/truss/image.md): Build and manage Truss Docker images.
- [truss init](https://docs.baseten.co/reference/cli/truss/init.md): Create a new Truss project.
- [truss login](https://docs.baseten.co/reference/cli/truss/login.md): Authenticate with Baseten.
- [truss model-logs](https://docs.baseten.co/reference/cli/truss/model-logs.md): Fetch logs for a deployed model.
- [Truss CLI reference](https://docs.baseten.co/reference/cli/truss/overview.md): Deploy, manage, and develop models using the Truss CLI.
- [truss predict](https://docs.baseten.co/reference/cli/truss/predict.md): Call the packaged model.
- [truss push](https://docs.baseten.co/reference/cli/truss/push.md): Deploy a model to Baseten.
- [truss run-python](https://docs.baseten.co/reference/cli/truss/run-python.md): Run a Python script in the Truss environment.
- [truss watch](https://docs.baseten.co/reference/cli/truss/watch.md): Live reload during development.
- [truss whoami](https://docs.baseten.co/reference/cli/truss/whoami.md): Show user information.
- [Chat Completions](https://docs.baseten.co/reference/inference-api/chat-completions.md): Creates a chat completion for the provided conversation. This endpoint is fully compatible with the OpenAI Chat Completions API, allowing you to use standard OpenAI SDKs by changing only the base URL and API key (a usage sketch appears at the end of this index).
- [Overview](https://docs.baseten.co/reference/inference-api/overview.md): The inference API is used to call deployed models and chains.
- [Async cancel request](https://docs.baseten.co/reference/inference-api/predict-endpoints/cancel-async-request.md): Use this endpoint to cancel a queued async request.
- [Async deployment](https://docs.baseten.co/reference/inference-api/predict-endpoints/deployment-async-predict.md): Use this endpoint to call any [published deployment](/deploy/lifecycle) of your model asynchronously.
- [Async chains deployment](https://docs.baseten.co/reference/inference-api/predict-endpoints/deployment-async-run-remote.md)
- [Deployment](https://docs.baseten.co/reference/inference-api/predict-endpoints/deployment-predict.md)
- [Chains deployment](https://docs.baseten.co/reference/inference-api/predict-endpoints/deployment-run-remote.md)
- [Websocket deployment](https://docs.baseten.co/reference/inference-api/predict-endpoints/deployment-websocket.md)
- [Async development](https://docs.baseten.co/reference/inference-api/predict-endpoints/development-async-predict.md): Use this endpoint to call the [development deployment](/deploy/lifecycle) of your model asynchronously.
- [Async chains development](https://docs.baseten.co/reference/inference-api/predict-endpoints/development-async-run-remote.md)
- [Development](https://docs.baseten.co/reference/inference-api/predict-endpoints/development-predict.md)
- [Chains development](https://docs.baseten.co/reference/inference-api/predict-endpoints/development-run-remote.md)
- [Websocket development](https://docs.baseten.co/reference/inference-api/predict-endpoints/development-websocket.md)
- [Async environment](https://docs.baseten.co/reference/inference-api/predict-endpoints/environments-async-predict.md): Use this endpoint to call the model associated with the specified environment asynchronously.
- [Async chains environment](https://docs.baseten.co/reference/inference-api/predict-endpoints/environments-async-run-remote.md): Use this endpoint to call the deployment associated with the specified environment asynchronously.
- [Environment](https://docs.baseten.co/reference/inference-api/predict-endpoints/environments-predict.md)
- [Chains environment](https://docs.baseten.co/reference/inference-api/predict-endpoints/environments-run-remote.md): Use this endpoint to call the deployment associated with the specified environment.
- [Websocket environment](https://docs.baseten.co/reference/inference-api/predict-endpoints/environments-websocket.md)
- [Async deployment](https://docs.baseten.co/reference/inference-api/status-endpoints/deployment-get-async-queue-status.md): Use this endpoint to get the status of a published deployment's async queue.
- [Async development](https://docs.baseten.co/reference/inference-api/status-endpoints/development-get-async-queue-status.md): Use this endpoint to get the status of a development deployment's async queue.
- [Async environment](https://docs.baseten.co/reference/inference-api/status-endpoints/environments-get-async-queue-status.md): Use this endpoint to get the async queue status for a model associated with the specified environment.
- [Async request](https://docs.baseten.co/reference/inference-api/status-endpoints/get-async-request-status.md): Use this endpoint to get the status of an async request.
- [Deployment](https://docs.baseten.co/reference/inference-api/wake/deployment-wake.md)
- [Development](https://docs.baseten.co/reference/inference-api/wake/development-wake.md)
- [Production](https://docs.baseten.co/reference/inference-api/wake/production-wake.md)
- [Create an API key](https://docs.baseten.co/reference/management-api/api-keys/creates-an-api-key.md): Creates an API key with the provided name and type. The API key is returned in the response.
- [Delete an API key](https://docs.baseten.co/reference/management-api/api-keys/delete-an-api-key.md): Deletes an API key by prefix and returns info about the API key.
- [Get all API keys](https://docs.baseten.co/reference/management-api/api-keys/lists-the-users-api-keys.md): Lists all API keys your account has access to.
- [Delete chains](https://docs.baseten.co/reference/management-api/chains/deletes-a-chain-by-id.md)
- [By ID](https://docs.baseten.co/reference/management-api/chains/gets-a-chain-by-id.md)
- [All chains](https://docs.baseten.co/reference/management-api/chains/gets-all-chains.md)
- [Any deployment by ID](https://docs.baseten.co/reference/management-api/deployments/activate/activates-a-deployment.md): Activates an inactive deployment and returns the activation status.
- [Activate environment deployment](https://docs.baseten.co/reference/management-api/deployments/activate/activates-a-deployment-associated-with-an-environment.md): Activates an inactive deployment associated with an environment and returns the activation status.
- [Development deployment](https://docs.baseten.co/reference/management-api/deployments/activate/activates-a-development-deployment.md): Activates an inactive development deployment and returns the activation status.
- [Update chainlet environment's autoscaling settings](https://docs.baseten.co/reference/management-api/deployments/autoscaling/update-a-chainlet-environments-autoscaling-settings.md): Updates a chainlet environment's autoscaling settings and returns the updated chainlet environment settings.
- [Any model deployment by ID](https://docs.baseten.co/reference/management-api/deployments/autoscaling/updates-a-deployments-autoscaling-settings.md): Updates a deployment's autoscaling settings and returns the update status.
- [Development model deployment](https://docs.baseten.co/reference/management-api/deployments/autoscaling/updates-a-development-deployments-autoscaling-settings.md): Updates a development deployment's autoscaling settings and returns the update status.
- [Any deployment by ID](https://docs.baseten.co/reference/management-api/deployments/deactivate/deactivates-a-deployment.md): Deactivates a deployment and returns the deactivation status.
- [Deactivate environment deployment](https://docs.baseten.co/reference/management-api/deployments/deactivate/deactivates-a-deployment-associated-with-an-environment.md): Deactivates a deployment associated with an environment and returns the deactivation status.
- [Development deployment](https://docs.baseten.co/reference/management-api/deployments/deactivate/deactivates-a-development-deployment.md): Deactivates a development deployment and returns the deactivation status.
- [Delete chain deployment](https://docs.baseten.co/reference/management-api/deployments/deletes-a-chain-deployment-by-id.md)
- [Delete model deployments](https://docs.baseten.co/reference/management-api/deployments/deletes-a-models-deployment-by-id.md): Deletes a model's deployment by ID and returns the tombstone of the deployment.
- [Any chain deployment by ID](https://docs.baseten.co/reference/management-api/deployments/gets-a-chain-deployment-by-id.md)
- [Any model deployment by ID](https://docs.baseten.co/reference/management-api/deployments/gets-a-models-deployment-by-id.md): Gets a model's deployment by ID and returns the deployment.
- [Development model deployment](https://docs.baseten.co/reference/management-api/deployments/gets-a-models-development-deployment.md): Gets a model's development deployment and returns the deployment.
- [Production model deployment](https://docs.baseten.co/reference/management-api/deployments/gets-a-models-production-deployment.md): Gets a model's production deployment and returns the deployment.
- [Get all chain deployments](https://docs.baseten.co/reference/management-api/deployments/gets-all-chain-deployments.md)
- [Get all model deployments](https://docs.baseten.co/reference/management-api/deployments/gets-all-deployments-of-a-model.md)
- [Cancel model promotion](https://docs.baseten.co/reference/management-api/deployments/promote/cancel-promotion.md): Cancels an ongoing promotion to an environment and returns the cancellation status.
- [Promote to chain environment](https://docs.baseten.co/reference/management-api/deployments/promote/promotes-a-chain-deployment-to-an-environment.md): Promotes an existing chain deployment to an environment and returns the promoted chain deployment.
- [Promote to model environment](https://docs.baseten.co/reference/management-api/deployments/promote/promotes-a-deployment-to-an-environment.md): Promotes an existing deployment to an environment and returns the promoted deployment.
- [Any model deployment by ID](https://docs.baseten.co/reference/management-api/deployments/promote/promotes-a-deployment-to-production.md): Promotes an existing deployment to production and returns the same deployment.
- [Development model deployment](https://docs.baseten.co/reference/management-api/deployments/promote/promotes-a-development-deployment-to-production.md): Creates a new production deployment from the development deployment and returns the currently building deployment.
- [Create Chain environment](https://docs.baseten.co/reference/management-api/environments/create-a-chain-environment.md): Creates a chain environment and returns the resulting environment.
- [Create environment](https://docs.baseten.co/reference/management-api/environments/create-an-environment.md): Creates an environment for the specified model and returns the environment.
- [Get Chain environment](https://docs.baseten.co/reference/management-api/environments/get-a-chain-environments-details.md): Gets a chain environment's details and returns the chain environment.
- [Get all Chain environments](https://docs.baseten.co/reference/management-api/environments/get-all-chain-environments.md): Gets all chain environments for a given chain.
- [Get all environments](https://docs.baseten.co/reference/management-api/environments/get-all-environments.md): Gets all environments for a given model.
- [Get environment](https://docs.baseten.co/reference/management-api/environments/get-an-environments-details.md): Gets an environment's details and returns the environment.
- [Update Chain environment](https://docs.baseten.co/reference/management-api/environments/update-a-chain-environments-settings.md): Updates a chain environment's settings and returns the chain environment.
- [Update chainlet environment's instance type](https://docs.baseten.co/reference/management-api/environments/update-a-chainlet-environments-instance-type-settings.md): Updates a chainlet environment's instance type settings. The chainlet environment setting must exist. When updated, a new chain deployment is created, deployed, and promoted to the chain environment according to the environment's promotion settings.
- [Update model environment](https://docs.baseten.co/reference/management-api/environments/update-an-environments-settings.md): Updates an environment's settings and returns the updated environment.
- [All instance types](https://docs.baseten.co/reference/management-api/instance-types/gets-all-instance-types.md)
- [Instance type prices](https://docs.baseten.co/reference/management-api/instance-types/gets-instance-type-prices.md)
- [Delete models](https://docs.baseten.co/reference/management-api/models/deletes-a-model-by-id.md)
- [By ID](https://docs.baseten.co/reference/management-api/models/gets-a-model-by-id.md)
- [All models](https://docs.baseten.co/reference/management-api/models/gets-all-models.md)
- [Overview](https://docs.baseten.co/reference/management-api/overview.md): The management API is used to manage models and deployments. It supports monitoring, CI/CD, and automation at both the model and workspace levels.
- [Get all secrets](https://docs.baseten.co/reference/management-api/secrets/gets-all-secrets.md)
- [Upsert a secret](https://docs.baseten.co/reference/management-api/secrets/upserts-a-secret.md): Creates a new secret or updates an existing secret if one with the provided name already exists. The name and creation date of the created or updated secret are returned.
- [Create a team API key](https://docs.baseten.co/reference/management-api/teams/creates-a-team-api-key.md): Creates a team API key with the provided name and type. The API key is returned in the response.
- [Create a team training project](https://docs.baseten.co/reference/management-api/teams/creates-a-team-training-project.md): Upserts a training project with the specified metadata for a team.
- [Get all team secrets](https://docs.baseten.co/reference/management-api/teams/gets-all-team-secrets.md)
- [List all teams](https://docs.baseten.co/reference/management-api/teams/lists-all-teams.md): Returns a list of all teams the authenticated user has access to.
- [Upsert a team secret](https://docs.baseten.co/reference/management-api/teams/upserts-a-team-secret.md): Creates a new secret or updates an existing secret if one with the provided name already exists. The name and creation date of the created or updated secret are returned. This secret belongs to the specified team.
- [Reference documentation](https://docs.baseten.co/reference/overview.md): For deploying, managing, and interacting with machine learning models on Baseten.
- [Chains SDK Reference](https://docs.baseten.co/reference/sdk/chains.md): Python SDK reference for Chains
- [Training SDK](https://docs.baseten.co/reference/sdk/training.md): Configure and manage training jobs with Baseten's training SDK.
- [Truss SDK Reference](https://docs.baseten.co/reference/sdk/truss.md): Python SDK for deploying and managing models with Truss.
- [Create training project](https://docs.baseten.co/reference/training-api/create-training-project.md): Upserts a training project with the specified metadata.
- [Download training job source code](https://docs.baseten.co/reference/training-api/download-training-job.md): Get the uploaded training job source code as an S3 artifact.
- [Get training job](https://docs.baseten.co/reference/training-api/get-training-job.md): Get the details of an existing training job.
- [Get training job checkpoint files](https://docs.baseten.co/reference/training-api/get-training-job-checkpoint-files.md): Get presigned URLs for all checkpoint files for a training job.
- [List training job checkpoints](https://docs.baseten.co/reference/training-api/get-training-job-checkpoints.md): Get the checkpoints for a training job.
- [Get training job logs](https://docs.baseten.co/reference/training-api/get-training-job-logs.md): Get the logs for a training job with the provided filters.
- [Get training job metrics](https://docs.baseten.co/reference/training-api/get-training-job-metrics.md): Get the metrics for a training job.
- [List training projects](https://docs.baseten.co/reference/training-api/get-training-projects.md): List all training projects for the organization.
- [List training jobs](https://docs.baseten.co/reference/training-api/list-training-jobs.md): List all training jobs for the training project.
- [Overview](https://docs.baseten.co/reference/training-api/overview.md): The Training API enables programmatic management of Baseten Training resources.
- [Recreate training job](https://docs.baseten.co/reference/training-api/recreate-training-job.md): Create a new training job with the same configuration as an existing training job.
- [Search training jobs](https://docs.baseten.co/reference/training-api/search-training-jobs.md): Search training jobs for the organization.
- [Stop training job](https://docs.baseten.co/reference/training-api/stop-training-job.md): Stops a training job.
- [Truss configuration](https://docs.baseten.co/reference/truss-configuration.md): Set your model resources, dependencies, and more
- [Baseten platform status](https://docs.baseten.co/status/status.md): Current operational status of Baseten's services.
- [Basics](https://docs.baseten.co/training/concepts/basics.md): Learn how to get up and running on Baseten Training
- [Cache](https://docs.baseten.co/training/concepts/cache.md): Learn how to use the training cache to speed up your training iterations by persisting data between jobs.
- [Checkpointing](https://docs.baseten.co/training/concepts/checkpointing.md): Learn how to use Baseten's checkpointing feature to manage model checkpoints and avoid disk errors during training.
- [Multinode Training](https://docs.baseten.co/training/concepts/multinode.md): Learn how to configure and run multinode training jobs with Baseten Training.
- [Serving your trained model](https://docs.baseten.co/training/deployment.md): How to deploy checkpoints from Baseten Training jobs as usable models.
- [Getting started](https://docs.baseten.co/training/getting-started.md): Your first steps to creating and running training jobs on Baseten.
- [Lifecycle](https://docs.baseten.co/training/lifecycle.md): Understanding the different states and transitions in a Baseten training job's lifecycle.
- [Loading Checkpoints](https://docs.baseten.co/training/loading.md): Resume training from existing checkpoints to continue where you left off
- [Management](https://docs.baseten.co/training/management.md): How to monitor, manage, and interact with your Baseten Training projects and jobs.
- [Training on Baseten](https://docs.baseten.co/training/overview.md): Own your intelligence and train custom models with our developer-first training infrastructure.
- [Deployments](https://docs.baseten.co/troubleshooting/deployments.md): Troubleshoot common problems during model deployment
- [Inference](https://docs.baseten.co/troubleshooting/inference.md): Troubleshoot common problems during model inference
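
## Usage sketches

The "Call your model" entry above covers running inference on a deployed model. Here is a minimal sketch of such a call, assuming the `requests` package and a `BASETEN_API_KEY` environment variable; the model ID and payload shape are placeholders, and the exact URL pattern should be confirmed against the predict-endpoint pages linked above:

```python
import os

import requests

# Hypothetical model ID; copy the real one from your Baseten workspace.
MODEL_ID = "abcd1234"

resp = requests.post(
    # Assumed predict endpoint for the production environment of a deployed model.
    f"https://model-{MODEL_ID}.api.baseten.co/environments/production/predict",
    headers={"Authorization": f"Api-Key {os.environ['BASETEN_API_KEY']}"},
    # The payload shape depends on your model's input schema.
    json={"prompt": "Hello, Baseten!"},
)
resp.raise_for_status()
print(resp.json())
```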
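
The "Chat Completions" entry above states that the endpoint is compatible with standard OpenAI SDKs once the base URL and API key are swapped. A minimal sketch under that assumption, using the `openai` Python package; the base URL and model slug below are placeholders to be replaced with the values from the Model APIs docs:

```python
import os

from openai import OpenAI

# Point the standard OpenAI client at Baseten instead of OpenAI.
client = OpenAI(
    api_key=os.environ["BASETEN_API_KEY"],       # a Baseten API key, not an OpenAI key
    base_url="https://inference.baseten.co/v1",  # assumed Model APIs base URL
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",  # hypothetical model slug
    messages=[{"role": "user", "content": "What is Baseten?"}],
)
print(response.choices[0].message.content)
```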