Text to Speech
Convert text to speech
This endpoint only accepts application/json and application/msgpack.
For best results, upload reference audio with the create model endpoint before using this one. This improves speech quality and reduces latency.
To upload audio clips directly, without pre-uploading, serialize the request body with MessagePack as per the instructions.
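A minimal sketch of a request body that carries a reference clip inline. The key names `text`, `references`, and `audio` are assumptions, since this page does not list the field names, and the third-party `msgpack` package is assumed to be installed. Raw audio bytes cannot be represented directly in JSON, which is why this route needs MessagePack.

```python
def build_inline_reference_body(text, clips):
    """Build a request body with reference audio embedded as raw bytes.

    `clips` is a list of (audio_bytes, transcript) pairs.
    The keys "text", "references", and "audio" are assumed names.
    """
    return {
        "text": text,
        "references": [
            {"audio": audio_bytes, "text": transcript}
            for audio_bytes, transcript in clips
        ],
    }


def to_msgpack(body):
    """Serialize the body to MessagePack, keeping bytes as binary values."""
    import msgpack  # third-party package: pip install msgpack

    return msgpack.packb(body, use_bin_type=True)
```

To use it, read a local clip with `pathlib.Path(...).read_bytes()`, pass it to `build_inline_reference_body`, and send the result of `to_msgpack` with `Content-Type: application/msgpack`.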
Audio formats supported:
- WAV / PCM
  - Sample Rate: 8kHz, 16kHz, 24kHz, 32kHz, 44.1kHz
  - Default Sample Rate: 44.1kHz
  - 16-bit, mono
- MP3
  - Sample Rate: 32kHz, 44.1kHz
  - Default Sample Rate: 44.1kHz
  - mono
  - Bitrate: 64kbps, 128kbps (default), 192kbps
- Opus
  - Sample Rate: 48kHz
  - Default Sample Rate: 48kHz
  - mono
  - Bitrate: -1000 (auto), 24kbps, 32kbps (default), 48kbps, 64kbps
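The combinations listed above can be checked on the client before a request is sent. The sketch below transcribes the list into a lookup table. The helper name `resolve_audio_settings` is hypothetical, and expressing sample rates in Hz is an assumption about how the values are passed.

```python
# Supported output settings, transcribed from the list above.
# Sample rates are in Hz (assumed unit), bitrates in kbps; -1000 means "auto" for Opus.
_PCM_LIKE = {"sample_rates": (8000, 16000, 24000, 32000, 44100), "default_rate": 44100}
SUPPORTED = {
    "wav": _PCM_LIKE,
    "pcm": _PCM_LIKE,
    "mp3": {
        "sample_rates": (32000, 44100),
        "default_rate": 44100,
        "bitrates": (64, 128, 192),
        "default_bitrate": 128,
    },
    "opus": {
        "sample_rates": (48000,),
        "default_rate": 48000,
        "bitrates": (-1000, 24, 32, 48, 64),
        "default_bitrate": 32,
    },
}


def resolve_audio_settings(fmt, sample_rate=None, bitrate=None):
    """Return (sample_rate, bitrate) with defaults filled in, or raise ValueError."""
    if fmt not in SUPPORTED:
        raise ValueError(f"unsupported format: {fmt!r}")
    spec = SUPPORTED[fmt]

    rate = spec["default_rate"] if sample_rate is None else sample_rate
    if rate not in spec["sample_rates"]:
        raise ValueError(f"{fmt} does not support a sample rate of {rate} Hz")

    # WAV and PCM have no bitrate setting in the list above.
    if "bitrates" not in spec:
        if bitrate is not None:
            raise ValueError(f"{fmt} does not take a bitrate")
        return rate, None

    kbps = spec["default_bitrate"] if bitrate is None else bitrate
    if kbps not in spec["bitrates"]:
        raise ValueError(f"{fmt} does not support a bitrate of {kbps} kbps")
    return rate, kbps
```

For example, `resolve_audio_settings("mp3")` fills in the documented defaults, while `resolve_audio_settings("mp3", 8000)` raises because 8kHz is only listed for WAV / PCM.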
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Headers
Specify which TTS model to use. One of speech-1.5, speech-1.6, agent-x0.
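The authorization and model headers described above could be assembled as follows. This is a sketch under one notable assumption: the header that selects the model is taken to be named `model`, which this page does not state. The helper name `build_headers` is also hypothetical.

```python
ALLOWED_MODELS = ("speech-1.5", "speech-1.6", "agent-x0")
ALLOWED_CONTENT_TYPES = ("application/json", "application/msgpack")


def build_headers(token, model, content_type="application/json"):
    """Build request headers for the text-to-speech endpoint."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if content_type not in ALLOWED_CONTENT_TYPES:
        raise ValueError(f"unsupported content type: {content_type!r}")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": content_type,
        "model": model,  # assumed header name, not confirmed by this page
    }
```

Validating the model name locally is optional, but it surfaces a typo before the request leaves the client.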
Body
Text to be converted to speech
Controls randomness in the speech generation. Higher values (e.g., 1.0) make the output more random, while lower values (e.g., 0.1) make it more deterministic
0 <= x <= 1
Controls diversity via nucleus sampling. Lower values (e.g., 0.1) make the output more focused, while higher values (e.g., 1.0) allow more diversity
0 <= x <= 1
References to be used for the speech. This requires MessagePack serialization and will override reference_voices and reference_texts
ID of the reference model to be used for the speech
Prosody to be used for the speech
Chunk length to be used for the speech
100 <= x <= 300
Whether to normalize the speech. This will reduce the latency but may reduce performance on numbers and dates
Format to be used for the speech. One of wav, pcm, mp3, opus.
Sample rate to be used for the speech
MP3 Bitrate to be used for the speech. One of 64, 128, 192.
Opus Bitrate to be used for the speech. One of -1000, 24, 32, 48, 64.
Latency mode to be used for the speech. One of normal, balanced. balanced will reduce the latency but may lead to performance degradation.
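The numeric ranges and allowed values in the Body section can be enforced before sending. The sketch below assumes the field names `text`, `temperature`, `top_p`, `chunk_length`, and `latency`, none of which are spelled out on this page, and the helper `build_body` is hypothetical. Only options that are explicitly passed are included, so anything left out falls back to whatever the server uses by default.

```python
# Ranges and allowed values copied from the Body section; key names are assumed.
RANGES = {
    "temperature": (0, 1),
    "top_p": (0, 1),
    "chunk_length": (100, 300),
}
LATENCY_MODES = ("normal", "balanced")


def build_body(text, **options):
    """Return a request body dict, validating each option that was passed."""
    body = {"text": text}
    for name, value in options.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must satisfy {low} <= x <= {high}, got {value}")
        elif name == "latency":
            if value not in LATENCY_MODES:
                raise ValueError(f"latency must be one of {LATENCY_MODES}, got {value!r}")
        else:
            raise ValueError(f"unknown option: {name}")
        body[name] = value
    return body
```

For instance, `build_body("Hello", temperature=0.7, chunk_length=200)` passes validation, while `chunk_length=50` raises because it falls outside 100 <= x <= 300.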
Response
Request fulfilled, document follows
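Putting the pieces together, a JSON request might be built and sent with the standard library as sketched below. The URL is a placeholder rather than the real endpoint, the `model` header name is an assumption, and the code assumes the response body is the generated audio, which this page implies but does not state outright.

```python
import json
import urllib.request

TTS_URL = "https://api.example.com/v1/tts"  # placeholder, not the real endpoint


def make_tts_request(token, model, body, url=TTS_URL):
    """Build (but do not send) a JSON POST request for the endpoint."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "model": model,  # assumed header name
        },
    )


def synthesize_to_file(request, out_path):
    """Send the request and stream the response bytes to out_path.

    urllib raises urllib.error.HTTPError for non-2xx responses.
    """
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        while chunk := response.read(8192):
            out.write(chunk)
```

Keeping request construction separate from sending makes the headers and body easy to inspect or log before any network call is made.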