Once deployed, your model is accessible via an API endpoint. To make an inference request, you’ll need:

  • Your model ID
  • An API key for your Baseten account
  • A JSON-serializable model input
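A minimal sketch of how those three pieces fit together in a request. The model ID and API key values are placeholders, and the `Api-Key` authorization header format is assumed here; substitute your own values:

```python
import json

model_id = "abcd1234"      # placeholder: your model ID
api_key = "YOUR_API_KEY"   # placeholder: your Baseten API key

# Assumed auth scheme: the API key is passed in the Authorization header.
headers = {"Authorization": f"Api-Key {api_key}"}

# Any JSON-serializable value works as model input.
payload = {"prompt": "Hello!"}
body = json.dumps(payload)  # must serialize cleanly to JSON
```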

API Endpoints

Baseten provides multiple endpoints for different inference modes:

  • /predict – Standard synchronous inference.
  • /async_predict – Asynchronous inference for long-running tasks.

Each endpoint is available per environment and per deployment. See the API reference for details.
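The two endpoints above can be sketched as URL shapes. This assumes Baseten's model-subdomain pattern (`https://model-{model_id}.api.baseten.co`) and the production environment path; the model ID is a placeholder:

```python
model_id = "abcd1234"  # placeholder: your model ID
base = f"https://model-{model_id}.api.baseten.co"

# Synchronous inference: blocks until the model returns a result.
sync_url = f"{base}/environments/production/predict"

# Asynchronous inference: returns immediately; suited to long-running tasks.
async_url = f"{base}/environments/production/async_predict"

# To actually invoke the model, POST JSON input to one of these URLs
# with your Authorization header set, e.g. via requests or urllib.
```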

Alternative Invocation Methods

  • Truss CLI: truss predict
  • Model Dashboard: “Playground” button in the Baseten UI