Python SDK

The ZeliSpeech Python SDK is a small, dependency-light client for the ZeliSpeech API. Point it at your server and get audio back in a couple of lines: client.text_to_speech.convert(...) and .stream(...) for synthesis, plus the play, save, and stream helpers for playback and writing files.

One SDK, one contract

The ZeliSpeech SDK speaks the same /v1 API documented throughout the API reference. The request and response shapes follow mainstream TTS-API conventions, so the calls should already look familiar.

Install

pip install zeli-tts

Runtime deps: requests (HTTP) and websocket-client (streaming). Live playback (play / stream) shells out to ffplay (ffmpeg), mpv, or afplay.
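Because live playback depends on an external player, it can help to check for one before calling play or stream. The snippet below is a minimal sketch using only the standard library. The helper name find_player and the preference order are assumptions for illustration, not part of the SDK, and the SDK may pick a player differently.

```python
import shutil

# Player names come from the list above; the preference order is an assumption.
CANDIDATE_PLAYERS = ("ffplay", "mpv", "afplay")


def find_player(candidates=CANDIDATE_PLAYERS):
    """Return the first candidate executable found on PATH, or None."""
    for name in candidates:
        if shutil.which(name):
            return name
    return None


if __name__ == "__main__":
    player = find_player()
    print(player or "no supported player found on PATH")
```

If this reports no player, installing ffmpeg (which provides ffplay) or mpv is the likely fix.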

Quickstart

from zeli_tts import ZeliSpeech, play, save, stream
 
client = ZeliSpeech(
    api_key="sk-zeli-...",                       # optional if the box runs open
    base_url="https://voice.your-domain.com",
)
 
# 1) Stream — first audio after the first sentence, not the whole passage
audio = client.text_to_speech.stream(
    voice_id="zeli-voice-1",
    text="Hey there, this is Zeli.",
    output_format="mp3_44100_128",
)
for chunk in audio:
    ...                # audio bytes as they generate
 
# 2) Stream + play live (needs ffplay or mpv)
stream(client.text_to_speech.stream(voice_id="zeli-voice-1", text="Playing as I generate."))
 
# 3) One-shot — a finished clip
audio = client.text_to_speech.convert(
    voice_id="zeli-voice-1",
    text="Hello world.",
    output_format="mp3_44100_128",
)
save(audio, "hello.mp3")

Client

ZeliSpeech(base_url, *, api_key=None, timeout=60.0)

`base_url` (string, required)

http://host:8000 or https://voice.your-domain.com; the ws(s):// URL is derived for streaming. Do not include a trailing /v1.

`api_key` (string, optional)

Sent as an Authorization: Bearer token on every request. Required only when the box sets ZELI_API_KEY.

`timeout` (float, optional, default: 60.0)

Per-request timeout in seconds.
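To make the base_url rules concrete, here is a rough sketch of how a ws(s):// URL could be derived from an http(s):// one, and how a trailing /v1 could be rejected. The function derive_ws_url is hypothetical and is not the SDK's implementation. It raises ValueError for simplicity, whereas the SDK reports a bad base_url as ConfigurationError.

```python
from urllib.parse import urlsplit, urlunsplit


def derive_ws_url(base_url: str) -> str:
    """Map http -> ws and https -> wss, keeping host, port, and path."""
    parts = urlsplit(base_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"base_url must start with http:// or https://: {base_url!r}")
    path = parts.path.rstrip("/")
    if path.endswith("/v1"):
        # The docs ask for the server root, without the API version suffix.
        raise ValueError("base_url must not include a trailing /v1")
    ws_scheme = "wss" if parts.scheme == "https" else "ws"
    return urlunsplit((ws_scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    print(derive_ws_url("http://host:8000"))
    print(derive_ws_url("https://voice.your-domain.com"))
```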

Synthesis — `client.text_to_speech`

convert(voice_id, text, *, model_id="zeli-turbo", voice_settings=None,
        output_format="mp3_44100_128") -> bytes
 
stream(voice_id, text, *, model_id="zeli-turbo", voice_settings=None,
       output_format="mp3_44100_128") -> AudioStream

convert(...) returns the finished audio as bytes in the requested output_format.
stream(...) returns an AudioStream — iterate it for audio bytes as they generate. .read() drains it to a single blob.
output_format accepts the full menu — see Output formats.
voice_settings maps onto Turbo delivery — see Voice settings.
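When a stream should land on disk without holding the whole clip in memory, the chunks can be written as they arrive. The helper below is a sketch that accepts any iterable of bytes, which is how an AudioStream behaves according to the description above. The name write_chunks is an assumption, not an SDK function.

```python
from typing import Iterable


def write_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write each chunk to `path` as it arrives; return total bytes written."""
    total = 0
    with open(path, "wb") as fh:
        for chunk in chunks:
            fh.write(chunk)
            total += len(chunk)
    return total


# Intended use with the SDK (not executed here):
#   audio = client.text_to_speech.stream(voice_id="zeli-voice-1", text="Hello.")
#   write_chunks(audio, "hello.mp3")
```

Calling .read() on the stream and passing the result to save should give the same file, at the cost of buffering the full clip first.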

Voices — `client.voices`

client.voices.list()                 # -> [Voice(id, label, description, custom), ...]
client.voices.add(name="My voice", file="me.wav", description="")  # -> Voice
client.voices.delete("custom-abc123")

add clones a voice zero-shot from a short reference clip (~10–20 s of clean single-speaker audio; file may be a path, a file object, or bytes).
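Since the reference clip length matters, a quick local check before uploading can save a round trip. The sketch below measures a WAV file with the standard-library wave module. Both function names and the hard 10 to 20 second bounds are assumptions taken from the guidance above, which is approximate; the server may accept clips outside that range, and the check says nothing about audio cleanliness or speaker count.

```python
import wave


def reference_clip_seconds(source) -> float:
    """Duration of a WAV clip; `source` is a path string or a binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_reasonable_length(source, min_s: float = 10.0, max_s: float = 20.0) -> bool:
    """True if the clip duration falls inside the suggested range (advisory only)."""
    return min_s <= reference_clip_seconds(source) <= max_s


# Intended use with the SDK (not executed here):
#   if is_reasonable_length("me.wav"):
#       voice = client.voices.add(name="My voice", file="me.wav")
```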

Helpers

from zeli_tts import play, save, stream
 
play(audio)                    # play a finished clip
stream(audio_stream)           # play chunks live; returns the collected audio
save(audio, "out.mp3")         # write bytes to a file

Errors

All SDK exceptions derive from ZeliSpeechError:

| Exception | When |
| --- | --- |
| `ConfigurationError` | bad `base_url`, or a missing dep / player |
| `ConnectionError` | server unreachable (refused / DNS / timeout) |
| `APIError` | non-2xx from an HTTP endpoint (`.status_code`, `.body`) |
| `GenerationError` | server sent an `error` event mid-synthesis |
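One way to use this hierarchy is to retry only the failures that are likely transient, such as an unreachable server, and let the rest propagate. The helper below is a generic sketch: call_with_retry is a hypothetical name, the exception class to retry on is passed in by the caller, and the backoff values are arbitrary. Note that the SDK's ConnectionError shares its name with a Python built-in, so pass the SDK's class explicitly; importing it from the top-level zeli_tts package is an assumption.

```python
import time


def call_with_retry(func, retry_on, *, attempts=3, base_delay=0.5, **kwargs):
    """Call func(**kwargs), retrying on `retry_on` with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func(**kwargs)
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error unchanged
            time.sleep(base_delay * 2 ** (attempt - 1))


# Intended use with the SDK (not executed here; import path is an assumption):
#   import zeli_tts
#   audio = call_with_retry(
#       client.text_to_speech.convert,
#       zeli_tts.ConnectionError,
#       voice_id="zeli-voice-1",
#       text="Hello world.",
#   )
```

Errors that are not listed in retry_on, for example an APIError carrying a 4xx status, pass straight through on the first attempt.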

Roadmap

The SDK ships a synchronous client first; an async variant (built on websockets and httpx) will follow. A published zeli-tts package for Python and a TypeScript/npm zelispeech SDK are on the near-term roadmap.
