Messages
Create Anthropic Messages API requests against Baseten Model APIs.
Endpoint: https://inference.baseten.co/v1/messages
Call with the Anthropic SDK
The Anthropic SDK sends the API key as x-api-key by default. Baseten reads Authorization, so override default_headers when creating the client.
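A minimal sketch of the override described above, assuming the `anthropic` Python package is installed and that its client appends `/v1/messages` to `base_url`. The helper names `baseten_client_kwargs` and `make_client` are hypothetical, chosen for illustration only.

```python
BASE_URL = "https://inference.baseten.co"  # assumption: the SDK appends /v1/messages


def baseten_client_kwargs(api_key: str) -> dict:
    """Build keyword arguments that make the SDK send an Authorization header."""
    return {
        "api_key": api_key,
        "base_url": BASE_URL,
        # Baseten reads Authorization, not the SDK's default x-api-key header.
        "default_headers": {"Authorization": f"Bearer {api_key}"},
    }


def make_client(api_key: str):
    """Create an Anthropic SDK client pointed at Baseten."""
    import anthropic  # third-party dependency: pip install anthropic

    return anthropic.Anthropic(**baseten_client_kwargs(api_key))
```

With a client from `make_client`, a request would go through the SDK's usual `client.messages.create(...)` call, using a model slug from Model APIs.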
Authorizations
Pass your Baseten API key. Clients automatically send Authorization: Bearer <key>. Direct callers can also use Authorization: Api-Key <key>; both schemes are accepted. The Anthropic SDK's default x-api-key header is not accepted; override default_headers to send Authorization instead.
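For direct callers, the two accepted schemes can be sketched with the standard library alone. This builds the request without sending it; the `Content-Type: application/json` header and the helper names are assumptions, not taken from this page.

```python
import json
import urllib.request

MESSAGES_URL = "https://inference.baseten.co/v1/messages"


def auth_header(api_key: str, scheme: str = "Bearer") -> dict:
    """Return an Authorization header using one of the two accepted schemes."""
    if scheme not in ("Bearer", "Api-Key"):
        raise ValueError("scheme must be 'Bearer' or 'Api-Key'")
    return {"Authorization": f"{scheme} {api_key}"}


def build_request(api_key: str, body: dict, scheme: str = "Bearer") -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a JSON body."""
    headers = {"Content-Type": "application/json", **auth_header(api_key, scheme)}
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(MESSAGES_URL, data=data, headers=headers, method="POST")
```

Passing the result to `urllib.request.urlopen` would send it; that step is left out here so the sketch has no network side effects.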
Body
Request body for creating a message.
The model slug to use. Find available models at Model APIs.
The conversation history as an ordered list of input messages. Alternating user and assistant roles are expected; the final message must be from the user.
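The ordering rule above can be checked client-side before sending. This is an illustrative sketch, assuming each message is a dict with `role` and `content` keys; `validate_messages` is a hypothetical helper, not part of any SDK.

```python
def validate_messages(messages: list) -> None:
    """Raise ValueError if roles do not alternate or the last message is not from the user."""
    if not messages:
        raise ValueError("messages must not be empty")
    for message in messages:
        if message.get("role") not in ("user", "assistant"):
            raise ValueError(f"unexpected role: {message.get('role')!r}")
    # Adjacent messages should come from different roles.
    for previous, current in zip(messages, messages[1:]):
        if previous["role"] == current["role"]:
            raise ValueError("user and assistant roles must alternate")
    if messages[-1]["role"] != "user":
        raise ValueError("the final message must be from the user")
```

The server may be more lenient than this check, since the page says alternation is expected rather than enforced.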
The maximum number of tokens to generate in the response. Required by the Messages API. The response may be shorter if it finishes naturally or hits a stop sequence. Range: x >= 1.
A system prompt that sets the model's behavior. Pass either a single string or an array of text content blocks.
Controls randomness. Lower values are more deterministic. Range: 0 to 1.
Nucleus sampling: only consider tokens with cumulative probability up to this value. Range: x <= 1.
Limits token selection to the top K most probable tokens at each step. Range: x >= 0.
Custom text sequences that will stop generation. When a stop sequence is hit, stop_reason is stop_sequence and stop_sequence contains the matched string.
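The fields and ranges above can be assembled into a request body as follows. This is a sketch only: the key names `model`, `messages`, `system`, `temperature`, `top_p`, and `top_k` are assumed from the upstream Anthropic Messages API because this page does not spell them out, and `build_body` is a hypothetical helper.

```python
def build_body(model, messages, max_tokens, *, system=None, temperature=None,
               top_p=None, top_k=None, stop_sequences=None) -> dict:
    """Assemble a request body, enforcing the documented ranges."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be >= 1")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    if top_p is not None and top_p > 1:
        raise ValueError("top_p must be <= 1")
    if top_k is not None and top_k < 0:
        raise ValueError("top_k must be >= 0")

    body = {"model": model, "messages": messages, "max_tokens": max_tokens}
    optional = {
        "system": system,
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "stop_sequences": stop_sequences,
    }
    # Only include optional fields the caller actually set.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body
```

Omitting unset fields, rather than sending nulls, leaves the server free to apply its own defaults.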
If true, the response is streamed as server-sent events. Each event has a type such as message_start, content_block_delta, or message_stop.
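A streamed response can be consumed by parsing the server-sent event lines. The sketch below works on any iterable of decoded text lines. It assumes each event's `data:` payload is a JSON object with a `type` field, and that `content_block_delta` events carry their text under `delta.text`, as in the upstream Anthropic API. The function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each server-sent event in a stream of lines."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format allows one optional space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_parts:
            # A blank line terminates the current event.
            yield json.loads("\n".join(data_parts))
            data_parts = []
    if data_parts:
        yield json.loads("\n".join(data_parts))


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate text deltas until a message_stop event arrives."""
    chunks = []
    for event in iter_sse_events(lines):
        if event.get("type") == "content_block_delta":
            chunks.append(event.get("delta", {}).get("text", ""))
        elif event.get("type") == "message_stop":
            break
    return "".join(chunks)
```

Lines beginning with `event:` are ignored here because the same type is read from the JSON payload.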
A list of tools the model may call. Each tool has a name, description, and input_schema (a JSON Schema object).
Controls which tool (if any) the model must call.
- ToolChoice
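A tool definition with the three documented fields might look like the sketch below. The `get_weather` tool and the `make_tool` helper are hypothetical. The `tool_choice` shape is an assumption borrowed from the upstream Anthropic API, since this page lists the ToolChoice variants without describing them.

```python
def make_tool(name: str, description: str, input_schema: dict) -> dict:
    """Bundle the three documented tool fields into one definition."""
    if not name:
        raise ValueError("a tool needs a non-empty name")
    return {"name": name, "description": description, "input_schema": input_schema}


# Hypothetical example tool; input_schema is a JSON Schema object.
weather_tool = make_tool(
    "get_weather",
    "Look up the current weather for a city.",
    {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)

# Assumed shape (not confirmed by this page): force a call to one named tool.
tool_choice = {"type": "tool", "name": weather_tool["name"]}
```

These values would be sent as the tools list (`[weather_tool]`) and the tool choice in the request body.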
An object describing metadata about the request. Supports user_id for abuse detection.
Response
Successful response
The message response returned by the model.
A unique identifier for this message, such as msg_abc123.
The object type, always message.
The role of the generated message, always assistant.
An array of content blocks generated by the model. Text responses contain a single text block; responses that invoke tools contain tool_use blocks.
A text content block.
- TextBlock
- ToolUseBlock
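Because the content array can mix both block kinds, callers usually separate them. The sketch below assumes each block is a dict with a `type` of `text` or `tool_use`, and that text blocks carry a `text` field; those field names follow the upstream Anthropic API and are not confirmed by this page. `split_content` is a hypothetical helper.

```python
def split_content(content: list) -> tuple:
    """Return (joined text, list of tool_use blocks) from a response's content array."""
    text_parts = []
    tool_calls = []
    for block in content:
        block_type = block.get("type")
        if block_type == "text":
            text_parts.append(block.get("text", ""))
        elif block_type == "tool_use":
            tool_calls.append(block)
        # Unknown block types are skipped rather than treated as errors.
    return "".join(text_parts), tool_calls
```

Skipping unknown types keeps the helper tolerant if new block kinds appear later.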
The model slug that produced the response.
Why the model stopped generating: end_turn (natural stop), max_tokens (hit the max_tokens limit), stop_sequence (matched a stop_sequences entry), or tool_use (model invoked a tool).
Token usage statistics for the request.
The stop sequence that was matched, if stop_reason is stop_sequence. Otherwise null.
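The four stop reasons can be handled with a simple dispatch on the response. This sketch assumes the response has been parsed into a dict; `describe_stop` is a hypothetical helper and the returned strings are illustrative.

```python
def describe_stop(response: dict) -> str:
    """Summarize why generation stopped, based on stop_reason and stop_sequence."""
    reason = response.get("stop_reason")
    if reason == "end_turn":
        return "finished naturally"
    if reason == "max_tokens":
        return "truncated at the max_tokens limit"
    if reason == "stop_sequence":
        # stop_sequence holds the matched string only in this case; otherwise it is null.
        return f"matched stop sequence {response.get('stop_sequence')!r}"
    if reason == "tool_use":
        return "model requested a tool call"
    return f"unrecognized stop_reason: {reason!r}"
```

A `max_tokens` result is the case most worth surfacing to users, since the output is cut off mid-generation.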