If you need a deployed model to try the invocation examples below, follow these steps to create and deploy a super basic Truss that accepts and returns binary data. The Truss performs no operations and is purely illustrative.
1. Create a Truss
To create a Truss, run:
truss init binary_test
This creates a Truss in a new directory binary_test. By default, newly created Trusses implement an identity function that returns the exact input they are given.
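The generated identity model can be sketched roughly as follows. This is a minimal sketch of the class `truss init` scaffolds (exact contents may differ by Truss version); `predict` simply returns its input:

```python
# Minimal sketch of a default Truss model class: an identity function.
class Model:
    def __init__(self, **kwargs):
        pass

    def load(self):
        # Model weights would be loaded here; the identity model needs none.
        pass

    def predict(self, model_input):
        # Return the input unchanged
        return model_input


model = Model()
print(model.predict({"byte_data": b"\x00\x01"}))  # {'byte_data': b'\x00\x01'}
```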
2. Add logging
Optionally, modify binary_test/model/model.py to log that the data received is of type bytes:
binary_test/model/model.py
def predict(self, model_input):
    # Run model inference here
    print(f"Input type: {type(model_input['byte_data'])}")
    return model_input
To call the model with binary input:
- Set the content-type HTTP header to application/octet-stream
- Use msgpack to encode the data or file
- Make a POST request to the model
This code sample assumes you have a file Gettysburg.mp3 in the current working directory. You can download the 11-second file from our CDN or replace it with your own file.
call_model.py
import os

import msgpack
import requests

model_id = "MODEL_ID"  # Replace with your model ID
deployment = "development"  # `development`, `production`, or a deployment ID
baseten_api_key = os.environ["BASETEN_API_KEY"]

# Specify the URL to which you want to send the POST request
url = f"https://model-{model_id}.api.baseten.co/{deployment}/predict"

headers = {
    "Authorization": f"Api-Key {baseten_api_key}",
    "content-type": "application/octet-stream",
}

with open('Gettysburg.mp3', 'rb') as file:
    response = requests.post(
        url,
        headers=headers,
        data=msgpack.packb({'byte_data': file.read()})
    )

print(response.status_code)
print(response.headers)
To support certain types like numpy and datetime values, you may need to extend client-side msgpack encoding with the same encoder and decoder used by Truss.
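As a rough illustration of how such an extension works, here is a minimal sketch using msgpack's `default` and `object_hook` parameters to round-trip a datetime. The hook names and wire format below are hypothetical; in practice you should reuse the exact encoder and decoder that Truss ships so both sides agree on the encoding:

```python
from datetime import datetime, timezone

import msgpack

# Hypothetical encoder: called by msgpack for types it cannot serialize
def encode_hook(obj):
    if isinstance(obj, datetime):
        # Illustrative wire format; match whatever Truss actually uses
        return {"__datetime__": True, "data": obj.isoformat()}
    raise TypeError(f"Cannot serialize {type(obj)}")

# Hypothetical decoder: called by msgpack for every decoded map
def decode_hook(obj):
    if obj.get("__datetime__"):
        return datetime.fromisoformat(obj["data"])
    return obj

payload = {"created_at": datetime(2024, 1, 1, tzinfo=timezone.utc)}
packed = msgpack.packb(payload, default=encode_hook)
unpacked = msgpack.unpackb(packed, object_hook=decode_hook)
print(unpacked["created_at"] == payload["created_at"])  # True
```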
To use the output of a non-streaming model response, decode the response content.
call_model.py
# Continues `call_model.py` from above
binary_output = msgpack.unpackb(response.content)

# Change extension if not working with mp3 data
with open('output.mp3', 'wb') as file:
    file.write(binary_output["byte_data"])
You can also stream output as binary. This is useful for sending large files or reading binary output as it is generated.
In model.py, the predict function must return an iterator that yields bytes.
model/model.py
# Replace the predict function in your Truss
def predict(self, model_input):
    import os

    current_dir = os.path.dirname(__file__)
    file_path = os.path.join(current_dir, "tmpfile.txt")
    with open(file_path, mode="wb") as file:
        file.write(bytes(model_input["text"], encoding="utf-8"))

    def iterfile():
        # Get the directory of the current file
        current_dir = os.path.dirname(__file__)
        # Construct the full path to the temporary file
        file_path = os.path.join(current_dir, "tmpfile.txt")
        with open(file_path, mode="rb") as file_like:
            yield from file_like

    return iterfile()
Then, in your client, you can use streaming output directly without decoding.
stream_model.py
import json
import os

import requests

model_id = "MODEL_ID"  # Replace with your model ID
deployment = "development"  # `development`, `production`, or a deployment ID
baseten_api_key = os.environ["BASETEN_API_KEY"]

# Endpoint for the model deployment, see API reference for more
url = f"https://model-{model_id}.api.baseten.co/{deployment}/predict"
headers = {"Authorization": f"Api-Key {baseten_api_key}"}

s = requests.Session()
with s.post(
    url,
    headers=headers,
    data=json.dumps({"text": "Lorem Ipsum"}),
    # Include stream=True as an argument so the requests library knows to stream
    stream=True,
) as response:
    for token in response.iter_content(1):
        print(token)  # Prints bytes
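Instead of printing each byte, a client will usually write streamed chunks to disk as they arrive, which avoids buffering a large response in memory. A minimal local sketch (no network calls; the helper name and simulated chunks are illustrative):

```python
import os
import tempfile

# Hypothetical helper: write each chunk to disk as soon as it is yielded
def save_stream(chunks, path):
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)

# Simulated chunks, as response.iter_content would yield them
chunks = [b"Lo", b"rem", b" Ipsum"]
path = os.path.join(tempfile.gettempdir(), "stream_demo.bin")
save_stream(chunks, path)

with open(path, "rb") as f:
    print(f.read())  # b'Lorem Ipsum'
```

With a real response, you would pass `response.iter_content(chunk_size)` as `chunks`, typically with a larger chunk size than 1 byte.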