Voices
Choose from the built-in Zeli voices, clone your own from a short reference
clip, or keep using a legacy voice_id carried over from a previous provider.
Unknown ids alias to your default voice, so a migrating client keeps working
without code changes.
Built-in voices
Pass any of these as the URL voice_id. Fetch the live list any time from
GET /v1/voices.
| Voice id | Character |
|---|---|
| zeli-voice-1 | Warm, professional Australian woman: calm, confident, conversational |
| zeli-voice-2 | Warm, friendly young woman, relaxed American lilt: a good all-rounder |
| zeli-voice-3 | Clear, articulate: measured, literary reading style for longer reads |
| zeli-voice-4 | Relaxed, easy-going man: casual, unhurried, approachable |
| zeli-voice-5 | Warm, low-pitched man: unhurried storytelling delivery |
| zeli-voice-6 | Neutral built-in default |
Each voice is pinned to a reference clip (zero-shot cloning), so the same voice
comes back on every call. Timbre is fixed by that clip — the
similarity_boost / use_speaker_boost sliders are
accepted but have no effect.
How a voice_id resolves
The URL voice_id is resolved in this order — the last step is what makes the
swap-out seamless:
1. A native id (zeli-voice-1, zeli-voice-2, …) or a custom-… id from /v1/voices is used exactly.
2. ZELI_EL_VOICE_ALIASES="legacy-id-a=zeli-voice-1,legacy-id-b=zeli-voice-4" maps specific legacy voice ids to your voices.
3. Well-known legacy voice ids / names carried over from a previous provider map to a comparable Zeli voice.
4. Anything else becomes the default voice (ZELI_EL_DEFAULT_VOICE, else the box default) instead of a 404.
The resolved voice is returned in the x-zeli-voice response header. To restore
strict voice resolution (404 on unknown ids), set ZELI_EL_STRICT_VOICES=1.
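The resolution order above can be sketched in a few lines of Python. This is an illustration only, not the server's implementation: the function names, the `well_known` mapping, and the choice to model a 404 as `None` are assumptions, and so is the reading that strict mode only changes the final fallback step. The alias string format is the one shown for ZELI_EL_VOICE_ALIASES.

```python
def parse_aliases(raw: str) -> dict[str, str]:
    """Parse 'legacy-a=zeli-voice-1,legacy-b=zeli-voice-4' into a dict."""
    aliases = {}
    for pair in raw.split(","):
        legacy, sep, target = pair.strip().partition("=")
        # Skip empty or malformed entries rather than failing.
        if sep and legacy.strip() and target.strip():
            aliases[legacy.strip()] = target.strip()
    return aliases


def resolve_voice(voice_id, known_ids, aliases, well_known, default_voice, strict=False):
    """Return the voice a request would use, or None where a 404 is assumed."""
    if voice_id in known_ids:       # 1. native or custom id, used exactly
        return voice_id
    if voice_id in aliases:         # 2. explicit alias from the environment
        return aliases[voice_id]
    if voice_id in well_known:      # 3. well-known legacy id or name (hypothetical table)
        return well_known[voice_id]
    if strict:                      # assumed effect of ZELI_EL_STRICT_VOICES=1
        return None
    return default_voice            # 4. anything else becomes the default voice


aliases = parse_aliases("legacy-id-a=zeli-voice-1,legacy-id-b=zeli-voice-4")
known = {"zeli-voice-1", "zeli-voice-4", "zeli-voice-6"}
print(resolve_voice("legacy-id-b", known, aliases, {}, "zeli-voice-6"))
```

Passing the alias table and default in as arguments keeps the sketch easy to test without touching environment variables.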
Leave your legacy voice ids in the calling code. Unknown ones alias to your default voice, so every call succeeds — then switch to explicit Zeli voice ids when you're ready.
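While migrating, it can help to log which requests were aliased so you know which legacy ids still need replacing. The sketch below compares the requested id with the x-zeli-voice response header. It assumes the header value is the bare resolved voice id and that `headers` is any mapping of header names to values; both helper names are hypothetical.

```python
def resolved_voice(headers):
    """Return the x-zeli-voice value, matching the header name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "x-zeli-voice":
            return value
    return None


def was_aliased(requested_id, headers):
    """True when the server reported a different voice than the one requested."""
    resolved = resolved_voice(headers)
    return resolved is not None and resolved != requested_id


# Example with a hand-written header mapping (not a real response).
if was_aliased("legacy-id-a", {"X-Zeli-Voice": "zeli-voice-1"}):
    print("legacy-id-a is still being aliased")
```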
Cloning a voice
Add your own voice zero-shot from a short reference clip (~10–20 s of clean,
single-speaker audio). It's then indexed (survives restarts), mirrored across
regions if configured, and usable as a voice_id.
from zeli_tts import ZeliSpeech

client = ZeliSpeech(api_key="sk-zeli-...", base_url="https://voice.your-domain.com")
voice = client.voices.add(name="My voice", file="me.wav", description="brand read")
print(voice.id)  # custom-…, now usable as voice_id

curl -X POST "https://voice.your-domain.com/voices" \
  -F "name=My voice" \
  -F "file=@me.wav"

With ZELI_VOICES_S3 configured, the voice store mirrors to a single
S3 bucket pinned to a home region: voices are identical in every region and
the data stays put while compute roams. Fully fail-open: no S3 ⇒ local-only.
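Before uploading a reference clip, a quick local check of its length can catch clips that fall well outside the suggested ~10–20 s. The sketch below uses only Python's standard `wave` module, so it handles uncompressed WAV files only. The helper names and the hard 10 and 20 second bounds are assumptions for illustration; the guidance above is approximate, and the service may accept other lengths.

```python
import wave


def clip_duration_seconds(path: str) -> float:
    """Duration of a WAV file, computed from its frame count and sample rate."""
    with wave.open(path, "rb") as clip:
        return clip.getnframes() / clip.getframerate()


def is_reasonable_reference(path: str, min_seconds: float = 10.0, max_seconds: float = 20.0) -> bool:
    """True when the clip length falls inside the assumed 10-20 s window."""
    return min_seconds <= clip_duration_seconds(path) <= max_seconds
```

For example, `is_reasonable_reference("me.wav")` could gate the upload call. The check says nothing about noise or the number of speakers, which still need a listen.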
Listing voices
GET /v1/voices and /v2/voices return the
ZeliSpeech-shaped voice list; /v1/voices/{id} returns one. Use these to
populate a picker exactly as any mainstream TTS client would.
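As a rough illustration of populating a picker, the sketch below turns an already-parsed voice list into sorted (id, label) pairs. The response shape is not specified here, so the `voices`, `voice_id`, and `name` keys are assumptions, as is the function name; adjust them to whatever GET /v1/voices actually returns.

```python
def picker_options(payload: dict) -> list[tuple[str, str]]:
    """Build (voice_id, label) pairs from an assumed {'voices': [...]} payload."""
    options = []
    for voice in payload.get("voices", []):
        voice_id = voice.get("voice_id")
        if not voice_id:
            continue  # an entry without an id cannot be selected
        label = voice.get("name") or voice_id  # fall back to the id as the label
        options.append((voice_id, label))
    # Sort by label, case-insensitively, for a stable picker order.
    return sorted(options, key=lambda option: option[1].lower())


# Hand-written sample payload, not real API output.
sample = {"voices": [{"voice_id": "zeli-voice-2", "name": "Friendly"},
                     {"voice_id": "zeli-voice-1", "name": "Calm"}]}
print(picker_options(sample))
```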