Migrate to ZeliSpeech
Moving an existing text-to-speech integration to ZeliSpeech is a two-line
change: the base_url and the API key. The ZeliSpeech request and response
shapes follow mainstream TTS-API conventions, so most integrations move with no
other code changes. Because ZeliSpeech runs on your own box, your text and
audio never leave your infrastructure.
The two-line change
from zeli_tts import ZeliSpeech

client = ZeliSpeech(
    api_key="sk-zeli-...",  # your ZELI_API_KEY
    base_url="https://voice.your-domain.com",  # your Zeli box, NO trailing /v1
)
# every existing call works unchanged

import { ZeliSpeech } from "zelispeech";

const client = new ZeliSpeech({
  apiKey: "sk-zeli-...",
  baseUrl: "https://voice.your-domain.com", // NO trailing /v1
});

Set the "API base URL" field to https://voice.your-domain.com
Set the "API key" field to your ZELI_API_KEY
Leave the voice id as-is (unknown ids alias to your default voice)

What carries over unchanged
- Endpoints: /v1/text-to-speech/{id}, /stream, /with-timestamps, the stream-input WebSocket, /v1/voices, /v1/models, /v1/user*.
- Output formats: the full mp3 / pcm / μ-law / A-law / opus / wav enum, with no tier gating. See Output formats.
- Voice settings: stability / style / speed map onto Turbo delivery; unused sliders are accepted as no-ops. See Voice settings.
- Model ids: a hard-coded model string can stay as-is; it resolves against the models list and runs on Zeli Turbo.
- Error shape: a detail-object for logical errors and a detail-array for validation errors. See Errors.
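If your client already branches on the error body, the two shapes described above can be handled in one place. The following is a minimal sketch, not the SDK's own error handling: the function name describe_error is hypothetical, and the inner keys "message" and "msg" are assumptions, so check the Errors page for the actual field names.

```python
from typing import Any


def describe_error(body: dict[str, Any]) -> list[str]:
    """Flatten an error body into human-readable lines.

    Handles `detail` as an object (logical error) and `detail` as an
    array (validation errors).
    """
    detail = body.get("detail")
    if isinstance(detail, dict):
        # Logical error: a single object. The "message" key is an assumption.
        return [str(detail.get("message", detail))]
    if isinstance(detail, list):
        # Validation errors: one entry per problem. The "msg" key is an assumption.
        lines = []
        for item in detail:
            if isinstance(item, dict):
                lines.append(str(item.get("msg", item)))
            else:
                lines.append(str(item))
        return lines
    # Unknown shape: fall back to the raw body so nothing is silently dropped.
    return [str(body)]
```

Falling back to the raw body keeps unexpected responses visible in logs instead of hiding them behind an empty message.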
What's different (by design)
- Voice ids: an unknown voice id aliases to your default voice instead of 404-ing (set ZELI_EL_STRICT_VOICES=1 for strict resolution). The resolved voice is in the x-zeli-voice header.
- Timbre sliders: similarity_boost / use_speaker_boost are no-ops; timbre is fixed by the reference clip (zero-shot cloning).
- Character alignment: approximate on HTTP, null on the WebSocket. See Character timestamps.
- Concurrency: the engine is batch-1; scale with more GPU workers behind a router. Per-key limits and 429 are on the roadmap.
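Because an unknown voice id is aliased rather than rejected, a migrated client may want to notice when that happened. The sketch below compares the requested id with the x-zeli-voice response header. It assumes the header value is a bare voice id that can be compared directly with the id you sent; the helper names resolved_voice and was_aliased are hypothetical.

```python
from typing import Mapping, Optional


def resolved_voice(headers: Mapping[str, str]) -> Optional[str]:
    """Return the x-zeli-voice header value, matching the name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "x-zeli-voice":
            return value.strip()
    return None


def was_aliased(requested_voice_id: str, headers: Mapping[str, str]) -> bool:
    """True when the server reports a voice other than the one requested."""
    resolved = resolved_voice(headers)
    return resolved is not None and resolved != requested_voice_id
```

Logging a warning when was_aliased returns True is a lighter-weight alternative to turning on strict resolution during the migration window.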
Not implemented yet
Speech-to-speech / voice conversion, sound-effects / music, dubbing, projects /
history, pronunciation-dictionary CRUD, the multi-stream-input WebSocket
variant, and real character alignment. The TTS convert/stream surface (plus
voices / models / user) is complete.
Migration checklist
- Set ZELI_API_KEY (or ZELI_API_KEYS) on the box.
- Optionally pin the swap-out voice: ZELI_EL_DEFAULT_VOICE=zeli-voice-1.
- Terminate TLS in front (an ALB + ACM cert, or Caddy / nginx). For the WebSocket, forward the Upgrade / Connection headers.
- Point your client's base URL at https://voice.your-domain.com.
- Smoke-test: a convert, a stream, and a stream-input call.
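Before running the smoke test, it can help to print the three URLs the checklist implies and confirm the base URL carries no trailing /v1. This is a rough sketch under stated assumptions: it assumes /stream and /stream-input are suffixes of /v1/text-to-speech/{id}, and that the WebSocket uses the same host with a wss scheme. Neither the function names nor the composed paths come from the SDK, so verify them against your deployment.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_base_url(base_url: str) -> str:
    """Strip trailing slashes and reject a base URL that ends in /v1."""
    cleaned = base_url.rstrip("/")
    if cleaned.endswith("/v1"):
        raise ValueError("base URL must not end in /v1")
    return cleaned


def smoke_test_urls(base_url: str, voice_id: str) -> dict[str, str]:
    """Build the convert, stream, and stream-input URLs (paths are assumptions)."""
    base = normalize_base_url(base_url)
    tts = f"{base}/v1/text-to-speech/{voice_id}"
    # Assumed: the WebSocket lives on the same host, with https mapped to wss.
    parts = urlsplit(f"{tts}/stream-input")
    ws_scheme = "wss" if parts.scheme == "https" else "ws"
    return {
        "convert": tts,
        "stream": f"{tts}/stream",
        "stream-input": urlunsplit(parts._replace(scheme=ws_scheme)),
    }


if __name__ == "__main__":
    for name, url in smoke_test_urls("https://voice.your-domain.com", "zeli-voice-1").items():
        print(f"{name}: {url}")
```

Rejecting a trailing /v1 up front catches the most likely copy-paste mistake before any request is sent.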
Then you're done — the same integration, now running on hardware you control.