Make raw HTTP requests to the Baseten management or inference APIs. The HTTP method defaults to GET, or to POST when --field, --raw-field, or --input is provided. JSON responses are pretty-printed by default; non-JSON responses are streamed raw. Use --jq to filter JSON responses.
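The default-method rule above can be restated as a small shell function. This is only an illustrative sketch: default_method is a hypothetical helper, not part of the CLI, and it assumes that an explicit -X/--method always takes precedence over the default.

```shell
# Hypothetical helper: predicts the HTTP method the CLI would pick
# for a given argument list, following the documented rule.
default_method() {
  method=GET
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -X|--method)
        # Assumption: an explicit method overrides the default.
        echo "$2"
        return
        ;;
      -F|--field|-f|--raw-field|--input)
        # Any body-producing option switches the default to POST.
        method=POST
        ;;
    esac
    shift
  done
  echo "$method"
}

default_method models                        # prints GET
default_method models --field name=my-model  # prints POST
```

The sketch only recognizes options passed as separate words (for example --field name=value), which is the form used throughout this page.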

management

baseten api management [OPTIONS] <api-path>
Make raw HTTP requests to the Baseten management API (api.baseten.co). Paths are relative to /v1/, so 'baseten api management models' requests /v1/models.

Options

-F, --field
TEXT (repeatable)
Add a field (key=value); the value is parsed as JSON
-H, --header
TEXT (repeatable)
Add a request header (key:value)
--input
TEXT
Read request body from file (use - for stdin)
-q, --jq
TEXT
Filter JSON output with a jq expression; implies --output json (or jsonl for streamed commands)
-X, --method
TEXT
HTTP method. Defaults to GET, or to POST when --field, --raw-field, or --input is provided
-o, --output
TEXT
default:"text"
Output format. One of: text, json, jsonl, none
-f, --raw-field
TEXT (repeatable)
Add a raw string field (key=value)
--remote-url
TEXT
Baseten remote URL, overrides BASETEN_REMOTE_URL (default https://app.baseten.co)
-v, --verbose
BOOL
Enable verbose logging

Examples

GET a management resource
baseten api management models
POST a management resource with fields
baseten api management models --field name=my-model
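For larger request bodies, --input reads the body from a file instead of building it from individual fields. The following is a minimal sketch, assuming the endpoint accepts a JSON body equivalent to the --field example above; body.json is an arbitrary file name, and the CLI calls are guarded so the snippet does nothing where baseten is not installed.

```shell
# Write the request body to a file (the file name is arbitrary).
cat > body.json <<'EOF'
{"name": "my-model"}
EOF

if command -v baseten >/dev/null 2>&1; then
  # Providing --input switches the default method to POST.
  baseten api management models --input body.json

  # The same request with the body read from stdin.
  baseten api management models --input - < body.json
fi
```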

Filter output with --jq

List model IDs from /v1/models
baseten api management models --jq '.models[].id'

Output

Text mode (--output text): The HTTP response body, passed through verbatim. JSON responses are pretty-printed; non-JSON responses are streamed raw to stdout.

JSON mode (--output json): payload type cmd.JSONUndefined. Shape depends on the requested endpoint. See the management API OpenAPI spec at https://api.baseten.co/v1/spec.

inference

baseten api inference [OPTIONS] <api-path>
Make raw HTTP requests to a Baseten inference endpoint. Requires either --model-id or --chain-id to identify the target. Use --environment to target a specific environment (e.g. production).

Options

--chain-id
TEXT
Chain ID to target
--environment
TEXT
Environment name (e.g. production)
-F, --field
TEXT (repeatable)
Add a field (key=value); the value is parsed as JSON
-H, --header
TEXT (repeatable)
Add a request header (key:value)
--input
TEXT
Read request body from file (use - for stdin)
-q, --jq
TEXT
Filter JSON output with a jq expression; implies --output json (or jsonl for streamed commands)
-X, --method
TEXT
HTTP method. Defaults to GET, or to POST when --field, --raw-field, or --input is provided
--model-id
TEXT
Model ID to target
-o, --output
TEXT
default:"text"
Output format. One of: text, json, jsonl, none
-f, --raw-field
TEXT (repeatable)
Add a raw string field (key=value)
--remote-url
TEXT
Baseten remote URL, overrides BASETEN_REMOTE_URL (default https://app.baseten.co)
-v, --verbose
BOOL
Enable verbose logging

Examples

POST a predict body to a model
baseten api inference production/predict --model-id <model-id> --field prompt=hello

Filter output with --jq

Filter a JSON predict response
baseten api inference production/predict --model-id <model-id> --field prompt=hello --jq '.result'
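When the predict body is generated by another program, it can be piped in with --input - and filtered in the same call. This is a sketch under assumptions: MODEL_ID is a hypothetical environment variable holding your model ID, and the prompt and .result field names are carried over from the examples above and depend on your model.

```shell
# Hypothetical variable; the placeholder is used when MODEL_ID is unset.
MODEL_ID="${MODEL_ID:-<model-id>}"

# Request body as a JSON string; the field name depends on the model.
payload='{"prompt": "hello"}'

# Guarded so the snippet does nothing where the CLI is not installed.
if command -v baseten >/dev/null 2>&1; then
  printf '%s' "$payload" |
    baseten api inference production/predict \
      --model-id "$MODEL_ID" \
      --input - \
      --jq '.result'
fi
```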

Output

Text mode (--output text): The inference endpoint’s response body, passed through verbatim. JSON responses are pretty-printed; non-JSON responses are streamed raw.

JSON mode (--output json): payload type cmd.JSONUndefined. Shape depends on the model and endpoint. See the inference API OpenAPI spec at https://api.baseten.co/inference-spec.