Model inference
How to call your model
Run inference on deployed models
Once you’ve deployed your model, it’s time to use it! Every model on Baseten is served behind an API endpoint. To call a model, you need:
- The model’s ID.
- An API key for your Baseten account.
- JSON-serializable model input.
You can call a model using:
- Its `predict` endpoint for the production deployment, development deployment, or any other published deployment.
- The Truss CLI command `truss predict`.
- The "call model" modal in the model dashboard on Baseten.
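The predict endpoints above follow a consistent URL pattern keyed on the model ID and deployment. The helper below is a sketch for illustration, not part of any SDK; the `/production/predict` path matches the example in this guide, and the development and deployment-specific paths are assumed to follow the same scheme.

```python
from typing import Optional

def predict_url(
    model_id: str,
    deployment: str = "production",
    deployment_id: Optional[str] = None,
) -> str:
    """Build a predict endpoint URL for a Baseten model (illustrative sketch).

    - production:  https://model-{model_id}.api.baseten.co/production/predict
    - development: https://model-{model_id}.api.baseten.co/development/predict
    - a specific published deployment, addressed by its deployment ID
    """
    base = f"https://model-{model_id}.api.baseten.co"
    if deployment_id is not None:
        return f"{base}/deployment/{deployment_id}/predict"
    return f"{base}/{deployment}/predict"
```

For example, `predict_url("abc123", "development")` yields the development deployment's endpoint for model `abc123` (a hypothetical model ID).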
Call by API endpoint
Production deployment
Development deployment
import urllib3
import os
model_id = ""  # The ID of the model to call
# Read secrets from environment variables
baseten_api_key = os.environ["BASETEN_API_KEY"]
resp = urllib3.request(
"POST",
f"https://model-{model_id}.api.baseten.co/production/predict",
headers={"Authorization": f"Api-Key {baseten_api_key}"},
json={}, # JSON-serializable model input
)
print(resp.json())
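If the call fails (for example, a missing API key or malformed input), the response body may be an error payload rather than model output, so it helps to check the status code before parsing. The helper below is a hypothetical sketch; the status codes are standard HTTP semantics, not Baseten-specific guarantees.

```python
def check_response(status: int, body: bytes) -> bytes:
    """Raise on non-2xx responses instead of parsing an error payload as output.

    Hypothetical helper for illustration; adapt to your error-handling needs.
    """
    if status == 401:
        raise RuntimeError("Authentication failed - check BASETEN_API_KEY")
    if not 200 <= status < 300:
        raise RuntimeError(f"Model call failed with HTTP {status}: {body[:200]!r}")
    return body
```

You would call `check_response(resp.status, resp.data)` before `resp.json()` in the example above.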
See the Baseten API reference for more details.
Call with Truss CLI
By model ID
By deployment ID
By working directory
truss predict --model $MODEL_ID -d "$MODEL_INPUT"
See the Truss CLI reference for more details.