Migrate to ZeliSpeech

Moving an existing text-to-speech integration to ZeliSpeech is a two-line change: set base_url and the API key. The ZeliSpeech request and response shapes follow mainstream TTS-API conventions, so most integrations move with no other code changes. Because ZeliSpeech runs on a box you host, your text and audio never leave your own infrastructure.

The two-line change

from zeli_tts import ZeliSpeech
 
client = ZeliSpeech(
    api_key="sk-zeli-...",                       # your ZELI_API_KEY
    base_url="https://voice.your-domain.com",    # your Zeli box — NO trailing /v1
)
# existing TTS convert / stream calls work unchanged

import { ZeliSpeech } from "zelispeech";
 
const client = new ZeliSpeech({
  apiKey: "sk-zeli-...",
  baseUrl: "https://voice.your-domain.com", // NO trailing /v1
});

Set the "API base URL" field to  https://voice.your-domain.com
Set the "API key" field to       your ZELI_API_KEY
Leave the voice id as-is          (unknown ids alias to your default voice)
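
The snippets above pass a base URL without a trailing /v1, presumably because the endpoint paths themselves start with /v1. For integrations that call the HTTP API directly instead of going through an SDK, the following is a minimal sketch of that convention. The helper names `normalize_base_url` and `endpoint_url` are hypothetical and not part of the ZeliSpeech SDK, and the suffix layout (`/stream`, `/with-timestamps` appended after the voice id) is an assumption.

```python
from urllib.parse import quote


def normalize_base_url(base_url: str) -> str:
    """Strip trailing slashes and an accidental trailing /v1 from a base URL."""
    cleaned = base_url.strip().rstrip("/")
    if cleaned.endswith("/v1"):
        cleaned = cleaned[: -len("/v1")]
    return cleaned.rstrip("/")


def endpoint_url(base_url: str, voice_id: str, suffix: str = "") -> str:
    """Build a text-to-speech URL.

    suffix is assumed to be "", "/stream" or "/with-timestamps".
    """
    root = normalize_base_url(base_url)
    # Percent-encode the voice id so unusual characters cannot alter the path.
    return f"{root}/v1/text-to-speech/{quote(voice_id, safe='')}{suffix}"
```

With this in place, a base URL that was copied with a stray /v1 still produces a single /v1 segment in the final request URL.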

What carries over unchanged

Endpoints (familiar shapes)

/v1/text-to-speech/{id}, /stream, /with-timestamps, the stream-input WebSocket, /v1/voices, /v1/models, /v1/user*.

output_format (full menu; optional)

The full mp3 / pcm / μ-law / A-law / opus / wav enum — with no tier gating. See Output formats.

voice_settings (accepted + mapped; optional)

stability / style / speed map onto Turbo delivery; unused sliders are accepted as no-ops. See Voice settings.

model_id (accepted + ignored; optional)

Keep a hard-coded model string — it resolves against the models list and runs on Zeli Turbo.

Error shapes (two envelopes)

A detail-object for logical errors and a detail-array for validation errors. See Errors.
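
Client code that logs or displays errors has to cope with both envelopes. The sketch below flattens either shape into a list of messages. It assumes the body has already been parsed from JSON into a dict; the inner keys `message` and `msg` are assumptions for illustration, and `error_messages` is a hypothetical helper, so check the Errors page for the actual field names.

```python
from typing import Any


def error_messages(body: dict[str, Any]) -> list[str]:
    """Flatten either error envelope into a list of human-readable messages."""
    detail = body.get("detail")
    if isinstance(detail, dict):
        # Logical error: a single detail object ("message" key is assumed).
        return [str(detail.get("message", detail))]
    if isinstance(detail, list):
        # Validation error: one entry per failed field ("msg" key is assumed).
        return [
            str(item.get("msg", item)) if isinstance(item, dict) else str(item)
            for item in detail
        ]
    # Missing detail yields no messages; any other type is passed through.
    return [] if detail is None else [str(detail)]
```

Falling back to the raw entry when the assumed key is absent keeps the helper useful even if the real field names differ.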

What's different (by design)

Behavioral differences to know

Voice ids — an unknown voice id aliases to your default voice instead of 404-ing (set ZELI_EL_STRICT_VOICES=1 for strict resolution). The resolved voice is in the x-zeli-voice header.
Timbre sliders — similarity_boost / use_speaker_boost are no-ops; timbre is fixed by the reference clip (zero-shot cloning).
Character alignment — approximate on HTTP, null on the WebSocket. See Character timestamps.
Concurrency — the engine is batch-1; scale with more GPU workers behind a router. Per-key limits and 429 are on the roadmap.
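
Two of these differences can be surfaced client-side so they do not pass silently. The sketch below detects voice aliasing by comparing the requested id with the x-zeli-voice response header, and separates the settings this page describes as mapped (stability, style, speed) from the ones accepted as no-ops. The helper names are hypothetical, and the assumption that the header carries the plain voice id is not confirmed here.

```python
from typing import Any, Mapping, Optional

# Settings this page describes as mapped onto Turbo delivery.
MAPPED_SETTINGS = {"stability", "style", "speed"}


def aliased_voice(requested_id: str, headers: Mapping[str, str]) -> Optional[str]:
    """Return the resolved voice id if it differs from the requested one, else None."""
    # HTTP header names are case-insensitive, so compare on lowercased keys.
    lowered = {name.lower(): value for name, value in headers.items()}
    resolved = lowered.get("x-zeli-voice")
    if resolved is None or resolved == requested_id:
        return None
    return resolved


def split_settings(
    voice_settings: Mapping[str, Any],
) -> tuple[dict[str, Any], list[str]]:
    """Split settings into those that take effect and the names that are no-ops."""
    mapped = {k: v for k, v in voice_settings.items() if k in MAPPED_SETTINGS}
    ignored = sorted(k for k in voice_settings if k not in MAPPED_SETTINGS)
    return mapped, ignored
```

Logging the result of `aliased_voice` during the first days after a migration makes it easy to spot hard-coded voice ids that are falling back to the default voice.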

Not implemented yet

Speech-to-speech / voice conversion, sound-effects / music, dubbing, projects / history, pronunciation-dictionary CRUD, the multi-stream-input WebSocket variant, and real character alignment. The TTS convert/stream surface (plus voices / models / user) is complete.

Migration checklist

Go-live steps for a public box

Set ZELI_API_KEY (or ZELI_API_KEYS) on the box.
Optionally pin the swap-out voice: ZELI_EL_DEFAULT_VOICE=zeli-voice-1.
Terminate TLS in front (an ALB + ACM cert, or Caddy / nginx). For the WebSocket, forward the Upgrade / Connection headers.
Point your client's base URL at https://voice.your-domain.com.
Smoke-test: a convert, a stream, and a stream-input call.
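
Before running the smoke tests, the first steps of the checklist can be verified mechanically. This is a minimal sketch, assuming the environment mapping is read on the box and the base URL is the one your client will use; `preflight` is a hypothetical helper, not a ZeliSpeech tool.

```python
from typing import Mapping
from urllib.parse import urlparse


def preflight(env: Mapping[str, str], base_url: str) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems: list[str] = []
    # Either a single key or a key list must be configured on the box.
    if not (env.get("ZELI_API_KEY") or env.get("ZELI_API_KEYS")):
        problems.append("set ZELI_API_KEY or ZELI_API_KEYS")
    # TLS is terminated in front of the box, so clients should use https.
    if urlparse(base_url).scheme != "https":
        problems.append("base URL should use https")
    return problems
```

On the box itself, it could be called as `preflight(os.environ, "https://voice.your-domain.com")` after importing `os`.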

Then you're done — the same integration, now running on hardware you control.
