Llama is served through an OpenAI-compatible API, so you can call it with the OpenAI client.
import os
from openai import OpenAI

# https://model-XXXXXXX.api.baseten.co/environments/production/sync/v1
model_url = ""

client = OpenAI(
    base_url=model_url,
    api_key=os.environ.get("BASETEN_API_KEY"),
)

stream = client.chat.completions.create(
    model="baseten",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What was the role of Llamas in the Inca empire?"}
    ],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
JSON Output
["streaming", "output", "text"]
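The streamed pieces arrive one chunk at a time, so it can be convenient to gather them into a single string instead of printing each one. The following is a minimal sketch, not part of the original example: `collect_stream` and `fake_chunk` are hypothetical helper names, and `fake_chunk` only imitates the `choices[0].delta.content` shape used in the loop above so the snippet runs without a network call.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_stream(chunks: Iterable[Any]) -> str:
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        # Skip chunks that carry no choices at all.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        # Chunks without text have content set to None.
        if delta is not None:
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in that mimics the shape of a streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Demo with stand-in chunks; with a real client, pass the `stream` object instead.
demo = [fake_chunk("streaming"), fake_chunk(" output"), fake_chunk(None), fake_chunk(" text")]
result = collect_stream(demo)
print(result)
```

With a live deployment, `collect_stream(stream)` would presumably replace the `for` loop when the full response is needed as one value rather than printed incrementally.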