Model inference
How to call your model
Run inference on deployed models
Once you’ve deployed your model, it’s time to use it! Every model on Baseten is served behind an API endpoint. To call a model, you need:
- The model’s ID.
- An API key for your Baseten account.
- JSON-serializable model input.
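Model input must serialize cleanly to JSON. As a quick check, you can round-trip your input through Python's `json` module before sending it (the field names below are hypothetical; the actual schema depends on your model):

```python
import json

# Hypothetical input schema; the actual fields depend on your model.
model_input = {"prompt": "What is the capital of France?", "max_tokens": 64}

# If this raises TypeError, the input is not JSON-serializable.
payload = json.dumps(model_input)
print(payload)
```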
You can call a model using:
- Its `predict` endpoint for the production deployment, development deployment, or another published deployment.
- The Truss CLI command `truss predict`.
- The "call model" modal in the model dashboard on Baseten.
Call by API endpoint
```python
import os

import urllib3

model_id = ""

# Read secrets from environment variables
baseten_api_key = os.environ["BASETEN_API_KEY"]

resp = urllib3.request(
    "POST",
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json={},  # JSON-serializable model input
)

print(resp.json())
```
See the Baseten API reference for more details.
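The example above targets the production deployment; other deployments follow the same URL pattern. A small helper (hypothetical, derived from the URL format shown above) makes this explicit:

```python
def predict_url(model_id: str, deployment: str = "production") -> str:
    """Build the predict endpoint URL for a given deployment.

    This helper is illustrative only; it assumes the URL pattern
    shown in the example above, with the deployment name as the
    path segment before /predict.
    """
    return f"https://model-{model_id}.api.baseten.co/{deployment}/predict"

print(predict_url("abcd1234"))
# → https://model-abcd1234.api.baseten.co/production/predict
```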
Call with Truss CLI
```sh
truss predict --model $MODEL_ID -d "$MODEL_INPUT"
```

Note that `$MODEL_INPUT` must be double-quoted so the shell expands the variable; single quotes would pass the literal string `$MODEL_INPUT` instead of your JSON input.
See the Truss CLI reference for more details.