Kokoro uses the following request and response format:
request:

    {"text": "Hello", "voice": "af", "speed": 1.0}

text: str = defaults to "Hi, I'm kokoro"
voice: str = defaults to "af", available options: "af", "af_bella", "af_sarah", "am_adam", "am_michael", "bf_emma", "bf_isabella", "bm_george", "bm_lewis", "af_nicole", "af_sky"
speed: float = defaults to 1.0. The speed of the generated audio

response:

    {"base64": "base64 encoded bytestring"}
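The request and response shapes above can be exercised locally before calling the model. The following is a minimal sketch, not part of Kokoro or the Baseten client: `build_request` and `decode_response` are hypothetical helper names, the voice list is copied from the options above, and the response body is faked so that no network call is made.

```python
import base64

# Voice names copied from the documented "voice" options.
VOICES = {
    "af", "af_bella", "af_sarah", "am_adam", "am_michael", "bf_emma",
    "bf_isabella", "bm_george", "bm_lewis", "af_nicole", "af_sky",
}


def build_request(text="Hi, I'm kokoro", voice="af", speed=1.0):
    """Build a request payload using the documented defaults (hypothetical helper)."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    return {"text": text, "voice": voice, "speed": float(speed)}


def decode_response(body):
    """Return the raw audio bytes from a body shaped like {"base64": "..."}."""
    return base64.b64decode(body["base64"])


# Build a payload and decode a stand-in response body.
payload = build_request("Hello", voice="bf_emma", speed=1.2)
fake_body = {"base64": base64.b64encode(b"RIFF").decode("ascii")}
audio = decode_response(fake_body)
```

Checking the voice name on the client side is a design choice made here for illustration; the source does not say how the model itself responds to an unlisted voice.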
Example usage:

    import base64
    import os

    import httpx

    # Replace the empty string with your model id below
    model_id = ""
    baseten_api_key = os.environ["BASETEN_API_KEY"]

    with httpx.Client() as client:
        # Make the API request
        resp = client.post(
            f"https://model-{model_id}.api.baseten.co/production/predict",
            headers={"Authorization": f"Api-Key {baseten_api_key}"},
            json={"text": "Hello world", "voice": "af", "speed": 1.0},
            timeout=None,
        )

    # Stop early on an HTTP error instead of failing on a missing key
    resp.raise_for_status()

    # Get the base64 encoded audio
    response_data = resp.json()
    audio_base64 = response_data["base64"]

    # Decode the base64 string
    audio_bytes = base64.b64decode(audio_base64)

    # Write to a WAV file
    with open("output.wav", "wb") as f:
        f.write(audio_bytes)

    print("Audio saved to output.wav")