Invoke your model over gRPC.

Unlike standard Truss models, which implement `load()` and `predict()` methods, gRPC models are built around a `.proto` file and run their own server process that handles gRPC requests directly. This approach gives developers full control over the gRPC server implementation.
For this to work, you must first package your gRPC server code into a Docker image. Once that is done, you can set up your Truss `config.yaml` to configure your deployment and push the server to Baseten.
Every gRPC service starts with a `.proto` file that defines the service interface and message types. Create an `example.proto` file in your project root:
Then compile the `.proto` file:
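With the `grpcio-tools` package installed, compilation typically looks like this:

```shell
pip install grpcio-tools
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. example.proto
```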
This generates the Python modules (`example_pb2.py` and `example_pb2_grpc.py`) for your gRPC service. For more information about Protocol Buffers, see the official documentation.
Next, write your gRPC server in `model.py`. Here’s a basic example:
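A minimal sketch, assuming the generated `example_pb2` / `example_pb2_grpc` modules define a `Prediction` service with a unary `Predict` RPC over string fields (all names here are illustrative; replace the echo logic with real inference):

```python
# model.py: minimal gRPC server sketch (service/message names are illustrative).
from concurrent import futures

import grpc

import example_pb2
import example_pb2_grpc


class PredictionService(example_pb2_grpc.PredictionServicer):
    def Predict(self, request, context):
        # Replace this echo logic with real model inference.
        return example_pb2.PredictResponse(output=request.input.upper())


def serve(port: int = 50051) -> None:
    # Thread-pool-backed gRPC server listening on all interfaces.
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
    example_pb2_grpc.add_PredictionServicer_to_server(PredictionService(), server)
    server.add_insecure_port(f"[::]:{port}")
    server.start()
    server.wait_for_termination()


if __name__ == "__main__":
    serve()
```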
Create a `Dockerfile` that bundles your gRPC server code and dependencies. Here’s a basic skeleton:
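The exact contents depend on your project layout; a sketch assuming the server entrypoint is `model.py` and the generated stubs sit next to it:

```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so Docker can cache this layer.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the server code and the generated example_pb2*.py modules.
COPY . .

# Port the gRPC server listens on (must match your server code).
EXPOSE 50051

CMD ["python", "model.py"]
```

Once written, build and push the image with something like `docker build -t your-registry/grpc-server:latest .` followed by `docker push your-registry/grpc-server:latest` (the image name is illustrative).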
Add a `requirements.txt` file with your gRPC dependencies:
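A minimal version, assuming the server only needs the gRPC runtime (pin versions as appropriate):

```text
grpcio
protobuf
```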
Build your Docker image and push it, replacing `your-registry` with your actual container registry (e.g., Docker Hub, Google Container Registry, AWS ECR). You can create a Docker Hub container registry by following their documentation.

Then update your `config.yaml` to use the custom Docker image and configure the gRPC server:
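As a sketch, the relevant parts of `config.yaml` might look like the following. `base_image` and `docker_server` are Truss configuration sections for custom images, but the specific values here (model name, image tag, command, port, resources) are illustrative, so check the Truss configuration reference for the exact fields your version supports:

```yaml
model_name: grpc-example
# Use the custom image pushed to your registry (name is illustrative).
base_image:
  image: your-registry/grpc-server:latest
# Run your own server process instead of the default Truss server.
docker_server:
  start_command: python model.py
  server_port: 50051
resources:
  cpu: "1"
  memory: 2Gi
```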
Finally, push your model using the `--promote` or `--publish` flag, since gRPC models aren’t supported in the development environment.
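For example, deploying straight to a published deployment might look like:

```shell
truss push --publish
```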