Kokoro TTS

State-of-the-art AI Text-to-Speech Model

Key Features

82M Parameters

Efficient model with only 82 million parameters, outperforming larger models.

Multiple Voicepacks

10 unique voicepacks available, with more to come.

#1 Ranked Model

Ranked first in the TTS Spaces Arena, ahead of models with more parameters and more training data.

Quick Start

OpenAI-Compatible Speech Endpoint

Using OpenAI's Python library

from openai import OpenAI

client = OpenAI(base_url="https://api.kokorotts.com/v1", api_key="not-needed")

# Stream the audio to disk; calling stream_to_file() on a plain create() response is deprecated
with client.audio.speech.with_streaming_response.create(
    model="kokoro",  # Not used, but required for compatibility; the library's default model names are also accepted
    voice="af_bella+af_sky",
    input="Hello world!",
    response_format="mp3",
) as response:
    response.stream_to_file("output.mp3")

Using Requests

import requests

response = requests.post(
    "https://api.kokorotts.com/v1/audio/speech",
    json={
        "model": "kokoro",  # Not used but required for compatibility
        "input": "Hello world!",
        "voice": "af_bella",
        "response_format": "mp3",  # Supported: mp3, wav, opus, flac
        "speed": 1.0
    }
)

# Stop on HTTP errors so an error body is not saved as audio
response.raise_for_status()

# Save audio
with open("output.mp3", "wb") as f:
    f.write(response.content)
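
The request above can be wrapped in a small helper that checks the output format before sending anything. This is a minimal sketch, not part of the Kokoro API: the endpoint, field names, and format list come from the example above, while build_payload, synthesize_to_file, the stem argument, and the 60-second timeout are hypothetical names and values chosen for illustration.

```python
# Endpoint and supported formats are taken from the Requests example above.
API_URL = "https://api.kokorotts.com/v1/audio/speech"
SUPPORTED_FORMATS = ("mp3", "wav", "opus", "flac")


def build_payload(text, voice="af_bella", response_format="mp3", speed=1.0):
    """Return the JSON body for a speech request, rejecting unknown formats."""
    if response_format not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Unsupported format {response_format!r}; "
            f"expected one of {SUPPORTED_FORMATS}"
        )
    return {
        "model": "kokoro",  # Not used but required for compatibility
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }


def synthesize_to_file(text, stem="output", timeout=60, **options):
    """Hypothetical helper: request speech and save it as <stem>.<format>."""
    # Imported here so build_payload works even without requests installed.
    import requests

    payload = build_payload(text, **options)
    response = requests.post(API_URL, json=payload, timeout=timeout)
    response.raise_for_status()  # Do not save an error body as audio
    path = f"{stem}.{payload['response_format']}"
    with open(path, "wb") as f:
        f.write(response.content)
    return path


# Example (performs a network request):
# synthesize_to_file("Hello world!", response_format="wav")
```

Validating the format locally means a typo such as "ogg" fails immediately with a clear message instead of depending on how the server reports the error, and deriving the file extension from the payload keeps the saved file's name in step with its contents.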