List and inspect Baseten Model APIs. Authentication is via ‘baseten auth login’ or the BASETEN_API_KEY environment variable.

fetch

baseten model-api fetch [OPTIONS]
Fetch a single Model API by name.

Options

-q, --jq
TEXT
Filter JSON output with a jq expression; implies --output json (or jsonl for streamed commands).
--model
TEXT
required
Name of the Model API to fetch.
-o, --output
TEXT
default:"text"
Output format. One of: text, json, jsonl, none.
--profile
TEXT
Use a specific stored profile for this command, overriding BASETEN_PROFILE and the current profile
-v, --verbose
BOOL
Enable verbose logging

Examples

Fetch a Model API by name
baseten model-api fetch --model <name>

Filter output with --jq

Print the Model API’s invoke URL
baseten model-api fetch --model <name> --jq '.invoke_url'
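
In a script, the same filter can feed a shell variable. The following is a minimal sketch rather than a documented recipe: the function name model_api_invoke_url is hypothetical, and it assumes --jq prints the selected value on a single line.

```shell
# Hypothetical helper: print the invoke URL of one Model API.
# Uses only the flags documented above (--model, --jq).
model_api_invoke_url() {
  baseten model-api fetch --model "$1" --jq '.invoke_url'
}

# Usage (assumes a Model API with this name exists in the workspace):
#   url=$(model_api_invoke_url "<name>")
```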

Output

Text mode (--output text): Field-per-line summary of the Model API.
JSON mode (--output json): payload type managementapi.ModelAPI.

list

baseten model-api list [OPTIONS]
List the Model APIs the workspace has added. Pass --all to browse the full visible catalog instead of just the added ones.

Options

--all
BOOL
Browse the full visible catalog instead of only the Model APIs the workspace has added.
-q, --jq
TEXT
Filter JSON output with a jq expression; implies --output json (or jsonl for streamed commands).
-o, --output
TEXT
default:"text"
Output format. One of: text, json, jsonl, none.
--profile
TEXT
Use a specific stored profile for this command, overriding BASETEN_PROFILE and the current profile
-v, --verbose
BOOL
Enable verbose logging

Examples

List the Model APIs the workspace has added
baseten model-api list
Browse the full visible catalog
baseten model-api list --all

Filter output with --jq

Print just the Model API names
baseten model-api list --jq '.items[].name'
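
The list and fetch commands can be chained to print each added Model API next to its invoke URL. This is a sketch under stated assumptions: list_invoke_urls is a hypothetical name, and the loop assumes --jq emits one unquoted name per line.

```shell
# Hypothetical helper: print "<name><TAB><invoke URL>" for every added Model API.
# Assumes --jq emits one unquoted name per line; adjust if your output differs.
list_invoke_urls() {
  baseten model-api list --jq '.items[].name' |
    while IFS= read -r name; do
      # </dev/null keeps the inner command from consuming the list of names.
      url=$(baseten model-api fetch --model "$name" --jq '.invoke_url' </dev/null)
      printf '%s\t%s\n' "$name" "$url"
    done
}
```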

Output

Text mode (--output text): Table with columns: NAME, CONTEXT, $/1M IN, $/1M OUT, ADDED. When no Model APIs match, prints “No Model APIs found.” to stderr.
JSON mode (--output json): payload type cmd.ModelAPIList.

predict

baseten model-api predict [OPTIONS]
POST an inference request to a Model API and write the response to stdout.

The request is sent to --url, which defaults to the OpenAI chat-completions endpoint on the shared inference host. Override it for other request shapes (e.g. /v1/messages, /v1/embeddings) or for a different host.

--content is the simple path: it builds an OpenAI chat-completions body with a single user message and --model as the model, and prints just the assistant’s reply. It is only valid for OpenAI chat URLs and requires --model.

--data and --file send a request body verbatim, so any format the endpoint accepts works (OpenAI, Anthropic, embeddings, custom). The response is written as-is: JSON is pretty-printed; streams and binary bodies are passed through.

Options

--content
TEXT
Single user message; builds an OpenAI chat-completions request and prints the assistant’s reply. Only valid for OpenAI chat URLs and requires --model. Mutually exclusive with other flags in group predict-input.
--data
TEXT
Inline request body, sent verbatim. Mutually exclusive with other flags in group predict-input.
--file
TEXT
Path to a file containing the request body, sent verbatim. Use '-' for stdin. Mutually exclusive with other flags in group predict-input.
-q, --jq
TEXT
Filter JSON output with a jq expression; implies --output json (or jsonl for streamed commands).
--model
TEXT
Name of the Model API. Required with --content, where it sets the request’s model.
-o, --output
TEXT
default:"text"
Output format. One of: text, json, jsonl, none.
--profile
TEXT
Use a specific stored profile for this command, overriding BASETEN_PROFILE and the current profile
--url
TEXT
Endpoint to POST the request to. Defaults to https://inference.baseten.co/v1/chat/completions.
-v, --verbose
BOOL
Enable verbose logging

Examples

Send a single user message
baseten model-api predict --model <name> --content "hello"
Send a full OpenAI-shaped body and stream it as JSONL
baseten model-api predict --model <name> --data '{"model":"<name>","messages":[{"role":"user","content":"hi"}],"stream":true}' --output jsonl
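
A request body can also be generated on the fly and piped in through --file -. The sketch below is illustrative only: predict_from_stdin is a hypothetical helper, and it passes --model alongside the body as the --data example above does.

```shell
# Hypothetical helper: send a one-message chat-completions body through stdin.
# Caveat: printf does no JSON escaping, so the prompt must not contain
# double quotes, backslashes, or newlines.
predict_from_stdin() {
  model_name=$1
  prompt=$2
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}\n' \
    "$model_name" "$prompt" |
    baseten model-api predict --model "$model_name" --file -
}
```

For prompts that need escaping, writing the body with a JSON-aware tool and passing its path to --file is likely the safer route.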

Filter output with --jq

Extract the assistant’s message content
baseten model-api predict --model <name> --content "hi" --jq '.choices[0].message.content'

Output

Text mode (--output text): With --content, the assistant message text. With --data/--file, the response body as-is (pretty-printed JSON, or a raw stream/binary body).
JSON mode (--output json): payload type cmd.JSONUndefined. With --content, the full chat-completions response is emitted. With --data/--file, a streamed response becomes one JSON record per chunk under --output jsonl, and a binary body is base64-encoded under a 'body' key.
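
Because --jq implies jsonl for streamed commands, a filter can be applied to each streamed record. The sketch below rests on assumptions not confirmed by this page: stream_reply_text is a hypothetical helper, each record is assumed to be an OpenAI-style streaming chunk with its text under choices[0].delta.content, and the CLI's --jq is assumed to accept jq's alternative operator (//) and empty.

```shell
# Hypothetical helper: stream a chat completion and print only the text deltas.
# Assumes each JSONL record is an OpenAI-style streaming chunk
# (choices[0].delta.content); "// empty" skips records without text.
stream_reply_text() {
  model_name=$1
  body=$2   # request body with "stream": true, sent verbatim
  baseten model-api predict --model "$model_name" --data "$body" \
    --jq '.choices[0].delta.content // empty'
}
```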