baseten model-api

List and inspect Baseten Model APIs. Authenticate with baseten auth login or the BASETEN_API_KEY environment variable.

describe

baseten model-api describe [OPTIONS]

Describe a single Model API by name.

Options

--jq TEXT

Filter JSON output with a jq expression; implies --output json (or jsonl for streamed commands)

--model TEXT

required

Name of the Model API to describe.

--output TEXT

default:"text"

Output format. One of: text, json, jsonl, none

TEXT

Use a specific stored profile for this command, overriding BASETEN_PROFILE and the current profile

BOOL

Enable verbose logging

Examples

Describe a Model API by name

baseten model-api describe --model <name>

Filter output with `--jq`

Print the Model API’s invoke URL

baseten model-api describe --model <name> --jq '.invoke_url'

Output

Text mode (--output text): Field-per-line summary of the Model API.

JSON mode (--output json): payload type managementapi.ModelAPI.
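
For scripts that need the same value as the --jq example above, here is a minimal Python sketch rather than an official client. It assumes the baseten CLI is on PATH and already authenticated. The helper names (describe_argv, invoke_url_from_json, describe_invoke_url) are hypothetical, and the only payload field it relies on is invoke_url, taken from the example above.

```python
import json
import subprocess


def describe_argv(model_name):
    """Build the argv for `baseten model-api describe` with JSON output."""
    return ["baseten", "model-api", "describe",
            "--model", model_name, "--output", "json"]


def invoke_url_from_json(text):
    """Read invoke_url from the JSON payload printed by the command."""
    return json.loads(text)["invoke_url"]


def describe_invoke_url(model_name):
    """Run the CLI (assumed installed and authenticated) and return the URL."""
    result = subprocess.run(describe_argv(model_name),
                            capture_output=True, text=True, check=True)
    return invoke_url_from_json(result.stdout)
```

Keeping the argv builder and the parser separate from the subprocess call lets both be checked without the CLI installed.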

list

baseten model-api list [OPTIONS]

List the Model APIs in the full visible catalog. Pass --added-only to restrict to just the Model APIs the workspace has added.

CLI v0.3.0 removed --all and changed the default: baseten model-api list now returns the full catalog instead of just added Model APIs. Scripts that relied on the old default should pass --added-only.

Options

--added-only BOOL

Restrict to the Model APIs the workspace has added instead of the full visible catalog.

--jq TEXT

Filter JSON output with a jq expression; implies --output json (or jsonl for streamed commands)

--output TEXT

default:"text"

Output format. One of: text, json, jsonl, none

TEXT

Use a specific stored profile for this command, overriding BASETEN_PROFILE and the current profile

BOOL

Enable verbose logging

Examples

List the full visible catalog of Model APIs

baseten model-api list

List only the Model APIs the workspace has added

baseten model-api list --added-only

Filter output with `--jq`

Print just the Model API names

baseten model-api list --jq '.items[].name'

Output

Text mode (--output text): Table with columns: NAME, CONTEXT, /1M IN, /1M OUT, ADDED. When no Model APIs match, prints “No Model APIs found.” to stderr.

JSON mode (--output json): payload type cmd.ModelAPIList.
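
A script can make the catalog scope explicit and read names from the JSON payload instead of parsing the text table. The sketch below is illustrative only: the helper names are hypothetical, the items and name keys come from the --jq example above, and treating a missing items key as an empty list is an assumption about how an empty result looks in JSON mode.

```python
import json


def list_argv(added_only=False):
    """Build the argv for `baseten model-api list` with JSON output."""
    argv = ["baseten", "model-api", "list", "--output", "json"]
    if added_only:
        # Pre-v0.3.0 default behaviour now has to be requested explicitly.
        argv.append("--added-only")
    return argv


def model_names(text):
    """Python equivalent of the jq filter '.items[].name'."""
    payload = json.loads(text)
    # Assumption: an empty result may omit "items" or give an empty list.
    return [item["name"] for item in payload.get("items") or []]
```

Passing the returned argv to subprocess.run with capture_output=True yields the text that model_names expects.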

predict

baseten model-api predict [OPTIONS]

POST an inference request to a Model API and write the response to stdout.

The request is sent to --url, which defaults to the OpenAI chat-completions endpoint on the shared inference host. Override it for other shapes (e.g. /v1/messages, /v1/embeddings) or different hosts.

--content is the simple path: it builds an OpenAI chat-completions body with a single user message and --model as the model, and prints just the assistant’s reply. It is only valid for OpenAI chat URLs and requires --model.

--data and --file send a request body verbatim, so any format the endpoint accepts works (OpenAI, Anthropic, embeddings, custom). The response is written as-is: JSON is pretty-printed, streams and binary bodies are passed through.
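
To move from --content to a verbatim body, it can help to see roughly what the simple path builds. The sketch below is an approximation based only on the description above (a single user message with --model as the model). The exact body the CLI sends is not shown on this page, and the names chat_body, PREDICT_STDIN_ARGV, and predict_with_body are hypothetical.

```python
import json
import subprocess


def chat_body(model, content):
    """Chat-completions body with one user message, per the description above."""
    return {"model": model,
            "messages": [{"role": "user", "content": content}]}


# Read the request body from stdin by passing '-' to --file.
PREDICT_STDIN_ARGV = ["baseten", "model-api", "predict", "--file", "-"]


def predict_with_body(body):
    """Send a body verbatim through the CLI (assumed installed) and return stdout."""
    result = subprocess.run(PREDICT_STDIN_ARGV, input=json.dumps(body),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Starting from chat_body and adding fields such as "stream": true gives a body suitable for --data or --file when the simple path is not enough.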

Options

--content TEXT

Single user message; builds an OpenAI chat-completions request and prints the assistant’s reply. Only valid for OpenAI chat URLs and requires --model. Mutually exclusive with other flags in group predict-input.

--data TEXT

Inline request body, sent verbatim. Mutually exclusive with other flags in group predict-input.

--file TEXT

Path to a file containing the request body, sent verbatim. Use ’-’ for stdin. Mutually exclusive with other flags in group predict-input.

--jq TEXT

Filter JSON output with a jq expression; implies --output json (or jsonl for streamed commands)

--model TEXT

Name of the Model API. Required with --content, where it sets the request’s model.

--output TEXT

default:"text"

Output format. One of: text, json, jsonl, none

TEXT

Use a specific stored profile for this command, overriding BASETEN_PROFILE and the current profile

--url TEXT

Endpoint to POST the request to. Defaults to https://inference.baseten.co/v1/chat/completions.

BOOL

Enable verbose logging

Examples

Send a single user message

baseten model-api predict --model <name> --content "hello"

Send a full OpenAI-shaped body and stream it as JSONL

baseten model-api predict --model <name> --data '{"model":"<name>","messages":[{"role":"user","content":"hi"}],"stream":true}' --output jsonl

Filter output with `--jq`

Extract the assistant’s message content

baseten model-api predict --model <name> --content "hi" --jq '.choices[0].message.content'
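
When the full JSON response is captured by a script, the same extraction can be done without jq. This is a small sketch assuming the response follows the OpenAI chat-completions shape that the filter above addresses; assistant_reply is a hypothetical helper name.

```python
import json


def assistant_reply(text):
    """Python equivalent of the jq filter '.choices[0].message.content'."""
    response = json.loads(text)
    return response["choices"][0]["message"]["content"]
```

A KeyError or IndexError here means the endpoint returned a different shape, which is likely when --url points at something other than a chat-completions endpoint.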

Output

Text mode (--output text): With --content, the assistant message text. With --data/--file, the response body as-is (pretty-printed JSON, or a raw stream/binary body).

JSON mode (--output json): payload type cmd.JSONUndefined. Under --output json, --content emits the full chat-completions response. For --data/--file, a streamed response becomes one JSON record per chunk under --output jsonl, and a binary body is base64-encoded under a ‘body’ key.
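
The two machine-readable cases described above can be consumed along these lines. This is a hedged sketch: the helper names are hypothetical, the record contents in a stream depend on the endpoint, and decoding with the standard base64 alphabet is an assumption, since the page does not say which variant is used.

```python
import base64
import json


def parse_jsonl(text):
    """Parse --output jsonl text: one JSON record per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def decode_binary_body(record):
    """Recover raw bytes from a record whose 'body' key holds base64 text."""
    # Assumption: standard (not URL-safe) base64 alphabet.
    return base64.b64decode(record["body"])
```

Skipping blank lines in parse_jsonl keeps it tolerant of a trailing newline at the end of the stream.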
