Overview
gRPC is a high-performance, open-source remote procedure call (RPC) framework that uses HTTP/2 for transport and Protocol Buffers for serialization. Unlike traditional HTTP APIs, gRPC provides strong type safety, high performance, and built-in support for streaming and bidirectional communication.
Why use gRPC with Baseten?
- Type safety: Protocol Buffers ensure strong typing and contract validation between client and server
- Ecosystem integration: Easily integrate Baseten with existing gRPC-based services
- Streaming support: Built-in support for server streaming, client streaming, and bidirectional streaming
- Language interoperability: Generate client libraries for multiple programming languages from a single .proto file
gRPC on Baseten
gRPC support in Baseten is implemented using Custom Servers. Unlike standard Truss models that use the load() and predict() methods, gRPC models run their own server process that handles gRPC requests directly.
This approach gives developers full control over the gRPC server implementation.
For this to work, you must first package your gRPC server code into a Docker image.
Once that is done, you can set up your Truss config.yaml to configure your deployment and push the server to Baseten.
Setup
Installation
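- Install Truss, for example with pip install --upgrade truss.
- Install the Protocol Buffer compiler (protoc), for example with brew install protobuf on macOS or apt-get install protobuf-compiler on Debian or Ubuntu.
- Install the gRPC tools for Python, for example with pip install grpcio-tools.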
Protocol Buffer Definition
Your gRPC service starts with a .proto file that defines the service interface and message types. Create an example.proto file in your project root:
example.proto
Generate Protocol Buffer Code
Generate the Python code from your .proto file. This produces two modules, example_pb2.py and example_pb2_grpc.py, for your gRPC service. For more information about Protocol Buffers, see the official documentation.
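The generation step can also be driven from Python through the grpcio-tools package, which is equivalent to running python -m grpc_tools.protoc on the command line. The sketch below is illustrative only: the Greeter service, its SayHello method, and the HelloRequest and HelloReply messages are hypothetical names chosen for this example, not the definitions from the original page.

```python
from pathlib import Path

# Hypothetical service definition used for illustration; replace it with
# the contents of your own example.proto.
PROTO_SOURCE = '''syntax = "proto3";

package example;

service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply);
}

message HelloRequest {
  string name = 1;
}

message HelloReply {
  string message = 1;
}
'''


def protoc_args(proto_file: str, out_dir: str = ".") -> list[str]:
    """Build the argument list passed to grpc_tools.protoc.main()."""
    proto_path = Path(proto_file)
    return [
        "grpc_tools.protoc",             # program name, as in sys.argv[0]
        f"-I{proto_path.parent}",        # directory searched for .proto files
        f"--python_out={out_dir}",       # where example_pb2.py is written
        f"--grpc_python_out={out_dir}",  # where example_pb2_grpc.py is written
        str(proto_path),
    ]


def generate(proto_file: str = "example.proto", out_dir: str = ".") -> int:
    """Compile the .proto file and return protoc's exit code (0 on success)."""
    # Imported lazily so this module loads even without grpcio-tools installed.
    from grpc_tools import protoc

    path = Path(proto_file)
    if not path.exists():
        # Write the illustrative definition only when no file is present.
        path.write_text(PROTO_SOURCE)
    return protoc.main(protoc_args(proto_file, out_dir))
```

Calling generate() from the project root should leave example_pb2.py and example_pb2_grpc.py next to example.proto, assuming grpcio-tools is installed.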
Model Implementation
Create your gRPC server implementation in a file called model.py. Here's a basic example:
model.py
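As a rough sketch of what such a server can look like, the code below assumes that example.proto defines a hypothetical Greeter service with a SayHello method, a HelloRequest message with a name field, and a HelloReply message with a message field. Port 50051 is the conventional gRPC default and is also an assumption here; use whatever port your deployment configuration expects.

```python
from concurrent import futures


def build_greeting(name: str) -> str:
    """Pure business logic, kept separate so it can be tested without gRPC."""
    cleaned = name.strip() or "world"
    return f"Hello, {cleaned}!"


def serve(port: int = 50051) -> None:
    """Start the gRPC server and block until the process is terminated."""
    # Imported here so the module loads even before the stubs are generated.
    import grpc
    import example_pb2        # generated from the hypothetical example.proto
    import example_pb2_grpc   # generated from the hypothetical example.proto

    class Greeter(example_pb2_grpc.GreeterServicer):
        def SayHello(self, request, context):
            # Map the request message to a reply message.
            return example_pb2.HelloReply(message=build_greeting(request.name))

    # A thread pool handles concurrent RPCs; the pool size is illustrative.
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    example_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)

    # Listen on all interfaces so the port is reachable from outside the container.
    server.add_insecure_port(f"[::]:{port}")
    server.start()
    server.wait_for_termination()
```

To start the server when the file is run directly, call serve() at the bottom of model.py under the usual main guard, so that the container's start command can simply run the file.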
Deployment
Step 1: Create a Dockerfile
Since gRPC on Baseten requires a custom server setup, you'll need to create a Dockerfile that bundles your gRPC server code and dependencies. Here's a basic skeleton:
Dockerfile
Create a requirements.txt file with your gRPC dependencies:
requirements.txt
Step 2: Build and Push Docker Image
Build and push your Docker image to a container registry:
Replace your-registry with your actual container registry (e.g., Docker Hub, Google Container Registry, AWS ECR). You can create a Docker Hub container registry by following their documentation.
Step 3: Configure Your Truss
Update your config.yaml to use the custom Docker image and configure the gRPC server:
config.yaml
Step 4: Deploy with Truss
Deploy your model using the Truss CLI. You need to use either the --promote or the --publish flag, since gRPC models aren't supported in the development environment.
Calling Your Model
Using a gRPC Client
Once deployed, you can call your model using any gRPC client. Here's an example Python client:
client.py
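A client along these lines might look like the sketch below. It assumes the same hypothetical Greeter service (SayHello, HelloRequest, HelloReply) and generated stubs as the server side. The target address, the name of the authentication metadata key, and the Api-Key value format are all assumptions that are passed in as parameters; take the real values from your Baseten deployment details.

```python
def build_call_metadata(header_name: str, api_key: str) -> tuple:
    """Build per-call metadata carrying the API key.

    gRPC metadata keys must be lowercase, so the header name is normalized.
    The "Api-Key <key>" value format is an assumption for this sketch.
    """
    return ((header_name.lower(), f"Api-Key {api_key}"),)


def say_hello(target: str, metadata: tuple, name: str = "World") -> str:
    """Open a TLS channel, call SayHello once, and return the reply text."""
    # Imported here so the module loads even before the stubs are generated.
    import grpc
    import example_pb2        # generated from the hypothetical example.proto
    import example_pb2_grpc   # generated from the hypothetical example.proto

    credentials = grpc.ssl_channel_credentials()
    with grpc.secure_channel(target, credentials) as channel:
        stub = example_pb2_grpc.GreeterStub(channel)
        reply = stub.SayHello(
            example_pb2.HelloRequest(name=name),
            metadata=metadata,  # sent as request headers with this call
            timeout=30,         # seconds; illustrative value
        )
    return reply.message


# Example usage with placeholder values (not real endpoints or header names):
#   metadata = build_call_metadata("<auth-header-name>", "<your-api-key>")
#   print(say_hello("<your-grpc-host>:443", metadata, name="Baseten"))
```

Keeping the metadata construction in its own function makes it easy to swap in the exact header name your deployment requires without touching the call logic.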