Deploy Custom Server from Docker image
A config.yaml is all you need
If you have a ready-to-use API server packaged in a Docker image, either an open source serving image like vLLM or a custom Docker image built in house, it's very easy to deploy on Baseten: all you need is a config.yaml file.
Specifying a Docker image in config.yaml
To specify the Docker image for a Custom Server, add a docker_server field to your config.yaml with the following settings:
start_command (required): the command that starts the server.
predict_endpoint (required): the endpoint that requests to the server are sent to. Note that deployed models can only support a single predict endpoint at the moment.
server_port (required): the port the server runs on.
readiness_endpoint (required): the endpoint used as the Kubernetes readiness probe, which determines when a container is ready to start accepting traffic.
liveness_endpoint (required): the endpoint used as the Kubernetes liveness probe, which determines when to restart a container.
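As a rough illustration of how these fields fit together, here is a minimal config.yaml sketch. The image name, start command, port, and endpoint paths are placeholders rather than values from this guide, and the base_image block is assumed to be where the Docker image itself is named.

```yaml
# config.yaml (sketch; every value below is a placeholder)
base_image:
  # Assumption: the image to run is named under base_image.image.
  image: your-registry/your-server:latest
docker_server:
  # Hypothetical command; use whatever starts your server inside the image.
  start_command: ./start-server --port 8000
  # The single endpoint that prediction requests are routed to.
  predict_endpoint: /predict
  # Must match the port the server actually listens on.
  server_port: 8000
  # Probed by Kubernetes to decide when the container can accept traffic.
  readiness_endpoint: /health
  # Probed by Kubernetes to decide when the container should be restarted.
  liveness_endpoint: /health
```

Using the same path for both probes is fine when the server exposes only one health route; separate paths are useful if readiness depends on slow work such as loading model weights.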
Example usage: run vLLM server from Docker image
One great use case for Custom Server is spinning up a popular open source model server such as the vLLM OpenAI-compatible server. Below is an example that deploys the Meta-Llama-3.1-8B-Instruct model using vLLM on a single A10G GPU.
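A config.yaml for this deployment might look roughly like the sketch below. The image tag, the --max-model-len value, the model_name, and the secret name hf_access_token are assumptions made for illustration. The start command assumes the image provides the vllm serve CLI and that the Hugging Face token is read from the HF_TOKEN environment variable.

```yaml
# config.yaml (sketch under the assumptions stated above)
base_image:
  image: vllm/vllm-openai:latest
docker_server:
  # Read the Hugging Face token from /secrets, then start the vLLM server.
  start_command: sh -c "HF_TOKEN=$(cat /secrets/hf_access_token) vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --port 8000 --max-model-len 1024"
  # vLLM's health route, used for both probes.
  readiness_endpoint: /health
  liveness_endpoint: /health
  # OpenAI-compatible chat completions route.
  predict_endpoint: /v1/chat/completions
  # Must match the --port passed to vllm serve.
  server_port: 8000
resources:
  accelerator: A10G
model_name: vllm-model-server
secrets:
  # Declares the secret by name; the value itself is stored in Baseten.
  hf_access_token: null
```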
In this setup, the /health endpoint provided by the vLLM server is passed as both readiness_endpoint and liveness_endpoint. This way, the vLLM server's internal health probe decides whether the server is ready to accept requests, or whether it is unhealthy and needs to be restarted.
More usage examples of Custom Server can be found here.
Installing custom Python packages
If you need to install additional Python packages, you can do so by adding a requirements.txt file to your Truss. The following example shows how to start the Infinity Embedding Model Server from a Docker image with the Python package infinity-embedding installed.
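One possible shape for such a config.yaml is sketched below. Everything specific in it is an assumption: the base image, the requirements_file key pointing at requirements.txt, the infinity_emb CLI and its flags, the embedding model, the port, the endpoint paths, and the accelerator. The requirements.txt next to it is assumed to contain a single line naming the Infinity package.

```yaml
# config.yaml (sketch; all concrete values are assumptions)
base_image:
  # A plain Python image, so the server comes from requirements.txt.
  image: python:3.11-slim
# Assumed key for pointing at a requirements file inside the Truss.
requirements_file: ./requirements.txt
docker_server:
  # Assumed CLI entry point installed by the Infinity package.
  start_command: sh -c "infinity_emb v2 --model-id BAAI/bge-small-en-v1.5 --port 7997"
  readiness_endpoint: /health
  liveness_endpoint: /health
  predict_endpoint: /embeddings
  # Must match the --port passed to the start command.
  server_port: 7997
resources:
  accelerator: L4
model_name: infinity-embedding-server
```

The point of this layout is that the image does not need to ship the server: the packages listed in requirements.txt are installed into it before start_command runs.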
Accessing secrets in Custom Server
As you might have noticed in the vLLM example above, you can access secrets in the Custom Server by reading them from the /secrets directory, provided you have stored those secrets in Baseten. This is useful if you need to pass environment variables or other secrets to your server.
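The pattern can be sketched as the fragment below. The secret name my_api_key, the environment variable API_KEY, and the start-server command are hypothetical. The sketch assumes that a secret declared under secrets is mounted as a file named after it inside /secrets.

```yaml
# config.yaml fragment (sketch; names are hypothetical)
secrets:
  # Declares the secret by name only; the value is stored in Baseten.
  my_api_key: null
docker_server:
  # Read the mounted secret file and expose it to the server as an env var.
  start_command: sh -c "API_KEY=$(cat /secrets/my_api_key) ./start-server --port 8000"
```

Reading the file at start time keeps the secret value out of config.yaml and out of the image.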