API Docs
Studio plans include API access. Use an API key created in your account settings.
Authentication
Include your API key in the X-API-Key header.
X-API-Key: <your_api_key>
POST /api/tts
Generate speech from text. Returns raw audio bytes with the appropriate Content-Type header.
{
"text": "Hello world",
"voice": "Lauren",
"format": "mp3",
"speed": 1.0,
"pitch": 1.0,
"sampleRate": 24000
}
Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | The text to convert to speech (max 10,000 chars on Studio). |
| voice | string | No | Voice ID. Defaults to "Lauren". See voice list below. |
| format | string | No | "mp3" (default) or "wav". |
| speed | number | No | Speech speed, 0.5 to 2.0. Default 1.0. |
| pitch | number | No | Pitch adjustment, 0.5 to 2.0. Default 1.0. |
| sampleRate | number | No | Audio sample rate, 16000 to 48000. Default 24000. |
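The ranges in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name `build_tts_payload` and the `max_chars` argument are hypothetical, and the character limit is passed in because it depends on the plan.

```python
def build_tts_payload(
    text: str,
    voice: str = "Lauren",
    fmt: str = "mp3",
    speed: float = 1.0,
    pitch: float = 1.0,
    sample_rate: int = 24000,
    max_chars: int = 10_000,  # assumption: adjust to your plan's limit
) -> dict:
    """Validate the documented ranges and return the JSON body as a dict."""
    if not text:
        raise ValueError("text is required")
    if len(text) > max_chars:
        raise ValueError(f"text exceeds {max_chars} characters")
    if fmt not in ("mp3", "wav"):
        raise ValueError('format must be "mp3" or "wav"')
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if not 0.5 <= pitch <= 2.0:
        raise ValueError("pitch must be between 0.5 and 2.0")
    if not 16000 <= sample_rate <= 48000:
        raise ValueError("sampleRate must be between 16000 and 48000")
    return {
        "text": text,
        "voice": voice,
        "format": fmt,
        "speed": speed,
        "pitch": pitch,
        "sampleRate": sample_rate,
    }
```

Calling `build_tts_payload("Hello world")` returns the same body as the example request above with every default filled in.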
Response
On success, returns 200 with raw audio bytes. The Content-Type header is audio/mpeg for MP3 or audio/wav for WAV. The X-Chars-Used header shows how many characters were consumed.
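As a rough illustration, a request could be built and sent with Python's standard library as sketched below. The base URL `https://api.example.com` is a placeholder, the `Content-Type: application/json` request header is an assumption based on the JSON body shown above, and the function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder: replace with the real host


def build_tts_request(api_key: str, payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /api/tts request carrying the API key header."""
    return urllib.request.Request(
        url=f"{base_url}/api/tts",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-API-Key": api_key,
            "Content-Type": "application/json",  # assumption: JSON request body
        },
        method="POST",
    )


def synthesize(api_key: str, payload: dict, out_path: str, base_url: str = BASE_URL) -> int:
    """Send the request, write the audio bytes to out_path, return characters used."""
    request = build_tts_request(api_key, payload, base_url)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
        # Assumption: X-Chars-Used holds a plain integer.
        chars_used = int(response.headers.get("X-Chars-Used", "0"))
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio)
    return chars_used
```

A call such as `synthesize(key, {"text": "Hello world", "format": "mp3"}, "hello.mp3")` would save the MP3 and return the character count reported by the server.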
Using Cloned Voices
Studio plan users with cloned voices can use them via the API. Use the clone-{name} format, where {name} is the name you gave your clone (lowercase, hyphens for spaces).
// If your clone is named "Joshua":
{
"text": "Hello from my cloned voice",
"voice": "clone-joshua"
}
// If your clone is named "My Narrator":
{
"text": "Welcome to the show",
"voice": "clone-my-narrator"
}
You can also use the clone UUID returned when the voice was created: clone-{uuid}. Find your clone voice IDs in your account dashboard under Voice Cloning.
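The name-to-ID rule described above (lowercase, hyphens for spaces) can be sketched as a small helper. The function name `clone_voice_id` is hypothetical, and how the service treats punctuation or other special characters is not documented, so this sketch only handles case and whitespace.

```python
import re


def clone_voice_id(clone_name: str) -> str:
    """Turn a clone's display name into the clone-{name} voice ID."""
    # Lowercase the name and replace each run of whitespace with one hyphen.
    slug = re.sub(r"\s+", "-", clone_name.strip().lower())
    return f"clone-{slug}"
```

With the two clones from the examples, `clone_voice_id("Joshua")` gives `clone-joshua` and `clone_voice_id("My Narrator")` gives `clone-my-narrator`. If in doubt, the IDs listed in the dashboard are the authoritative values.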
Audio Markup (Delivery Style)
Pro and Studio plans can use audio markup tags to control delivery style. Place one tag at the start of your text for best results. Tags are not counted toward your character limit. English only. This is an experimental feature.
// Emotion / delivery tags (place at start of text):
[happy] [sad] [angry] [excited] [whisper]
[calm] [scared] [friendly] [professional] [sarcastic]
// Example request:
{
"text": "[excited] Check out this amazing new feature!",
"voice": "Sarah"
}
Tip: For best results, use one tag per request. If you need different emotions for different parts, split them into separate API calls.
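One way to follow this tip is to split a script at each emotion tag and send every piece as its own request. The sketch below is a client-side convenience only; the helper name `split_by_emotion` is hypothetical. Sound effect tags are left in place because they are meant to stay inline.

```python
import re

EMOTION_TAGS = (
    "happy", "sad", "angry", "excited", "whisper",
    "calm", "scared", "friendly", "professional", "sarcastic",
)
_EMOTION_RE = re.compile(r"\[(" + "|".join(EMOTION_TAGS) + r")\]")


def split_by_emotion(text: str) -> list[str]:
    """Split text so that each returned segment starts with at most one emotion tag."""
    # re.split with one capture group yields:
    # [text_before_first_tag, tag1, text1, tag2, text2, ...]
    parts = _EMOTION_RE.split(text)
    segments = []
    untagged = parts[0].strip()
    if untagged:
        segments.append(untagged)
    for tag, body in zip(parts[1::2], parts[2::2]):
        body = body.strip()
        if body:
            segments.append(f"[{tag}] {body}")
    return segments
```

Each returned string can then be used as the `text` field of a separate request, and the resulting audio files joined afterwards.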
Sound Effects
Insert non-verbal sounds inline in your text. Sound effects are available on all plans and produce actual audio in the output.
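Because tags are plain bracketed words, a typo such as `[laugh]` is easy to miss. The following sketch flags bracketed tags that appear in neither the sound effect list nor the delivery tag list from this page. It is a local check only; the helper name `unknown_tags` is hypothetical, and what the API does with an unrecognized tag is not documented.

```python
import re

SOUND_EFFECT_TAGS = {
    "laughing", "cough", "sigh", "gasp", "groan", "yawn",
    "sniff", "clap", "shush", "hmm", "ugh", "cheer",
}
EMOTION_TAGS = {
    "happy", "sad", "angry", "excited", "whisper",
    "calm", "scared", "friendly", "professional", "sarcastic",
}


def unknown_tags(text: str) -> list[str]:
    """Return bracketed tags in text that are not documented on this page."""
    found = re.findall(r"\[([^\[\]]+)\]", text)
    known = SOUND_EFFECT_TAGS | EMOTION_TAGS
    return [tag for tag in found if tag not in known]
```

An empty result means every bracketed tag in the text matches one of the documented names.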
// Available sound effect tags:
[laughing] [cough] [sigh] [gasp] [groan]
[yawn] [sniff] [clap] [shush] [hmm]
[ugh] [cheer]
// Example:
{
"text": "[laughing] That was hilarious! [sigh] Okay, back to work.",
"voice": "Alex"
}
Word Timestamps / SRT Subtitles
Pro and Studio plans can request word-level timestamps. The response includes an X-Timestamps header with JSON-encoded timing data, and an SRT subtitle file can be downloaded alongside the audio from the web interface.
// To request timestamps, add the header:
X-Include-Timestamps: true
// The response will include:
// - Audio bytes in the body
// - X-Timestamps header with JSON array of word timings
// - X-Chars-Used header with character count
// Timestamp format:
[
{ "word": "Hello", "start": 0.0, "end": 0.42 },
{ "word": "world", "start": 0.45, "end": 0.87 }
]
Available Voices
Voices are organized by tier. Your plan determines which voices you can access via the API. Use the voice name as the voice parameter.
| Tier | Plan Required | Example Voices |
|---|---|---|
| Free | All plans | Lauren, Elliot, Blake, Luna, Mia, Oliver, Ethan, Claire, Dennis, Sophie |
| Starter | Starter+ | Alex, Timothy, Ashley, Liam, Chloe, Carter, Jessica, Julia, Priya, Hana |
| Pro | Pro+ | Theodore, Shaun, Edward, Deborah, Wendy, Craig, Elizabeth, James, Felix, Serena |
| Studio | Studio only | Sarah, Hunter (Mark), Benedict (Clive), Dominus, Hades, Pixie, + all non-English voices |
For the full voice list with language, gender, and accent metadata, call GET /api/voice-metadata (public, no auth required).
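The tier table reads as a ladder: each plan can use its own tier and every tier below it. A minimal sketch of that rule follows. It assumes plan names match the tier names and that a plan called "Free" exists; both are assumptions drawn from the table, and `accessible_tiers` is a hypothetical helper.

```python
# Tiers ordered from lowest to highest, as in the table above.
TIER_ORDER = ["Free", "Starter", "Pro", "Studio"]


def accessible_tiers(plan: str) -> list[str]:
    """Return the voice tiers a plan can use: its own tier and all lower ones."""
    rank = TIER_ORDER.index(plan)  # raises ValueError for an unknown plan name
    return TIER_ORDER[: rank + 1]
```

For example, `accessible_tiers("Pro")` covers the Free, Starter, and Pro tiers but not Studio voices.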
Rate Limits
| Limit | Value |
|---|---|
| Requests per minute | 30 |
| Max characters per request | 20,000 (Studio) |
| Monthly character limit | Per plan (750K for Studio). Top-ups available. |
| Concurrent requests | 5 |
If you hit a rate limit, the API returns 429 with a Retry-After header.
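A client can honor the Retry-After header by waiting before it retries. The sketch below assumes the header carries a number of seconds (HTTP also allows a date, which this sketch does not parse and treats as the default delay). The function names, the default delay, and the attempt count are illustrative. Note that a 429 caused by an exhausted monthly character limit will not clear by waiting.

```python
import time
import urllib.error
import urllib.request


def retry_after_seconds(headers, default: float = 2.0) -> float:
    """Read Retry-After as seconds, falling back to default if missing or unparsable."""
    value = headers.get("Retry-After")
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        return default


def send_with_retry(request, max_attempts: int = 3,
                    opener=urllib.request.urlopen, sleep=time.sleep) -> bytes:
    """Send a request, waiting and retrying when the server answers 429."""
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(request) as response:
                return response.read()
        except urllib.error.HTTPError as error:
            # Re-raise anything that is not a rate limit, or the final failure.
            if error.code != 429 or attempt == max_attempts:
                raise
            sleep(retry_after_seconds(error.headers))
    raise ValueError("max_attempts must be at least 1")
```

The `opener` and `sleep` parameters exist only so the function can be exercised without network access or real delays.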
Voice Design via API
Voice Design (creating a custom voice from a text description) is currently available through the web interface only. Designed voices are automatically saved to your account and can then be used through the API with the clone-{name} voice ID format, just like cloned voices. Voice design shares the same 3-slot limit as voice cloning.
Error Codes
| Status | Error | Description |
|---|---|---|
| 400 | CloneNotFound | The requested clone voice was not found in your account. |
| 401 | Unauthorized | Missing or invalid API key. |
| 402 | PaymentRequired | Feature requires a higher plan (e.g., voice cloning requires Studio). |
| 403 | Forbidden | Access denied (banned, wrong owner, etc.). |
| 429 | LimitExceeded | Monthly character limit or rate limit exceeded. |
| 500 | TTSProviderError | TTS service error. Try again or use a different voice. |
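A client can use the status code alone to decide whether a failed call is worth retrying. The sketch below mirrors the table; treating 429 and 500 as retryable is a judgment call based on the descriptions above, not documented behavior, and `describe_error` is a hypothetical helper. The format of the error response body is not documented here, so it is not parsed.

```python
ERROR_NAMES = {
    400: "CloneNotFound",
    401: "Unauthorized",
    402: "PaymentRequired",
    403: "Forbidden",
    429: "LimitExceeded",
    500: "TTSProviderError",
}

# Assumption: only rate limiting and provider errors may succeed on a later attempt.
RETRYABLE_STATUSES = {429, 500}


def describe_error(status: int) -> tuple[str, bool]:
    """Return (error name, whether a retry might help) for an HTTP status code."""
    name = ERROR_NAMES.get(status, "Unknown")
    return name, status in RETRYABLE_STATUSES
```

For instance, `describe_error(401)` reports an authentication problem that a retry will not fix, while `describe_error(500)` suggests trying again or switching voices.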