If you need a deployed model to try the invocation examples below, follow these steps to create and deploy a super basic Truss that accepts and returns binary data. The Truss performs no operations and is purely illustrative.
1. Create a Truss
To create a Truss, run:
truss init binary_test
This creates a Truss in a new directory binary_test. By default, newly created Trusses implement an identity function that returns the exact input they are given.
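For reference, the generated binary_test/model/model.py scaffold looks roughly like this (a sketch only; the exact template depends on your Truss version):
class Model:
    def __init__(self, **kwargs):
        # Truss passes config values as keyword arguments
        self._model = None

    def load(self):
        # Load model weights here; nothing to load for this illustrative Truss
        pass

    def predict(self, model_input):
        # Identity function: return the input unchanged
        return model_input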
2. Add logging
Optionally, modify binary_test/model/model.py to log that the data received is of type bytes:
binary_test/model/model.py
def predict(self, model_input):
    # Run model inference here
    print(f"Input type: {type(model_input['byte_data'])}")
    return model_input
Once your Truss is deployed (for example, with truss push), you can call the model with binary data:
1. Set the content-type HTTP header to application/octet-stream.
2. Use msgpack to encode the data or file.
3. Make a POST request to the model.
This code sample assumes you have a file Gettysburg.mp3 in the current working directory. You can download the 11-second file from our CDN or replace it with your own file.
call_model.py
import os
import requests
import msgpack

model_id = "MODEL_ID"  # Replace with your model ID
deployment = "development"  # `development`, `production`, or a deployment ID
baseten_api_key = os.environ["BASETEN_API_KEY"]

# Specify the URL to which you want to send the POST request
url = f"https://model-{model_id}.api.baseten.co/{deployment}/predict"

headers = {
    "Authorization": f"Api-Key {baseten_api_key}",
    "content-type": "application/octet-stream",
}

with open("Gettysburg.mp3", "rb") as file:
    response = requests.post(
        url,
        headers=headers,
        data=msgpack.packb({"byte_data": file.read()}),
    )

print(response.status_code)
print(response.headers)
To support certain types, such as numpy arrays and datetime values, you may need to extend client-side msgpack encoding with the same encoder and decoder used by Truss.
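As an illustration, here is a minimal client-side encoder/decoder pair using tagged dicts of our own choosing; the function names (encode_custom, decode_custom) and the dict format are assumptions for this sketch, and the hooks Truss itself uses may differ.
import datetime

import msgpack
import numpy as np

def encode_custom(obj):
    # Hypothetical encoder: turn numpy arrays and datetimes into tagged dicts
    if isinstance(obj, np.ndarray):
        return {"__ndarray__": True, "dtype": str(obj.dtype), "shape": list(obj.shape), "data": obj.tobytes()}
    if isinstance(obj, datetime.datetime):
        return {"__datetime__": True, "iso": obj.isoformat()}
    return obj

def decode_custom(obj):
    # Hypothetical decoder: reverse the tagged-dict encoding above
    if obj.get("__ndarray__"):
        return np.frombuffer(obj["data"], dtype=obj["dtype"]).reshape(obj["shape"])
    if obj.get("__datetime__"):
        return datetime.datetime.fromisoformat(obj["iso"])
    return obj

payload = msgpack.packb(
    {"created": datetime.datetime.now(), "embedding": np.arange(4, dtype=np.float32)},
    default=encode_custom,
)
restored = msgpack.unpackb(payload, object_hook=decode_custom)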
To use the output of a non-streaming model response, decode the response content.
call_model.py
# Continues `call_model.py` from above
binary_output = msgpack.unpackb(response.content)

# Change extension if not working with mp3 data
with open("output.mp3", "wb") as file:
    file.write(binary_output["byte_data"])
You can also stream output as binary. This is useful for sending large files or reading binary output as it is generated.
In model.py, the predict function must return a streaming output, such as a generator.
model/model.py
# Replace the predict function in your Truss
def predict(self, model_input):
    import os

    current_dir = os.path.dirname(__file__)
    file_path = os.path.join(current_dir, "tmpfile.txt")
    with open(file_path, mode="wb") as file:
        file.write(bytes(model_input["text"], encoding="utf-8"))

    def iterfile():
        # Get the directory of the current file
        current_dir = os.path.dirname(__file__)
        # Construct the full path to the temporary file
        file_path = os.path.join(current_dir, "tmpfile.txt")
        with open(file_path, mode="rb") as file_like:
            yield from file_like

    return iterfile()
Then, in your client, you can use streaming output directly without decoding.
stream_model.py
import os
import requests
import json

model_id = "MODEL_ID"  # Replace with your model ID
deployment = "development"  # `development`, `production`, or a deployment ID
baseten_api_key = os.environ["BASETEN_API_KEY"]

# Specify the URL to which you want to send the POST request
url = f"https://model-{model_id}.api.baseten.co/{deployment}/predict"
headers = {
    "Authorization": f"Api-Key {baseten_api_key}",
}

s = requests.Session()
with s.post(
    # Endpoint for the chosen deployment, see the API reference for more
    url,
    headers=headers,
    data=json.dumps({"text": "Lorem Ipsum"}),
    # Include stream=True as an argument so the requests library knows to stream
    stream=True,
) as response:
    for token in response.iter_content(1):
        print(token)  # Prints bytes
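If the streamed output is a large binary file, you may prefer to write chunks to disk as they arrive instead of printing individual bytes. A minimal variation on the request above (the output filename is illustrative):
# Continues from the setup in `stream_model.py` above
with s.post(
    url,
    headers=headers,
    data=json.dumps({"text": "Lorem Ipsum"}),
    stream=True,
) as response, open("streamed_output.txt", "wb") as out_file:
    # Write the response to disk in 8 KB chunks as it streams in
    for chunk in response.iter_content(chunk_size=8192):
        out_file.write(chunk)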