Text to Speech
This endpoint only accepts application/json and application/msgpack.
For best results, upload reference audio using the create model endpoint before calling this one. This improves speech quality and reduces latency.
To upload audio clips directly, without pre-uploading, serialize the request body with MessagePack as per the instructions.
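JSON cannot carry raw bytes, which is why inline audio clips need MessagePack. The following is a minimal sketch of building and packing such a body. The field names `text`, `references`, and `audio` are assumptions for illustration (this page does not show the exact schema), and `msgpack` is the third-party package installed with `pip install msgpack`.

```python
def build_inline_reference_payload(text, audio_bytes, transcript):
    """Build a request body that embeds one reference clip as raw bytes.

    The keys used here ("text", "references", "audio") are assumed names,
    not confirmed by this page.
    """
    return {
        "text": text,
        "references": [
            {"audio": audio_bytes, "text": transcript},
        ],
    }


def pack_payload(payload):
    """Serialize the body with MessagePack (sent as application/msgpack)."""
    import msgpack  # third-party dependency, imported lazily

    # use_bin_type=True keeps bytes as the MessagePack binary type.
    return msgpack.packb(payload, use_bin_type=True)
```

The packed bytes would be sent as the request body with the Content-Type header set to application/msgpack.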
Audio formats supported:
- WAV / PCM (16-bit, 44100 Hz, mono)
- MP3 (44100 Hz, mono)
- Opus (48000 Hz, mono)
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
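As an illustration of the header format, here is a sketch that builds (but does not send) a JSON request using only the Python standard library. The URL is a placeholder and the `text` field name is an assumption; substitute the real endpoint and schema.

```python
import json
import urllib.request

# Placeholder URL, not the real endpoint.
API_URL = "https://api.example.com/v1/tts"


def build_tts_request(token, text):
    """Return a POST request carrying a Bearer token and a JSON body."""
    body = json.dumps({"text": text}).encode("utf-8")  # "text" is an assumed field name
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it.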
Body
Text to be converted to speech
References to be used for the speech. This requires MessagePack serialization and overrides reference_voices and reference_texts.
ID of the reference model to be used for the speech
Chunk length to be used for the speech
Whether to normalize the speech. This reduces latency but may reduce performance on numbers and dates.
Format to be used for the speech. Options: wav, pcm, mp3, opus
MP3 bitrate to be used for the speech. Options: 64, 128, 192
Opus bitrate to be used for the speech. Options: -1000, 24, 32, 48, 64
Latency mode to be used for the speech. balanced reduces latency but may lead to performance degradation. Options: normal, balanced
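Because several body fields accept only a fixed set of values, it can help to check them client-side before sending. The sketch below does that with the values listed above. The key names (`format`, `mp3_bitrate`, `opus_bitrate`, `latency`) are assumptions, since this page lists the allowed values but not the field names.

```python
# Allowed values as listed on this page; the key names are assumed.
ALLOWED_VALUES = {
    "format": {"wav", "pcm", "mp3", "opus"},
    "mp3_bitrate": {64, 128, 192},
    "opus_bitrate": {-1000, 24, 32, 48, 64},
    "latency": {"normal", "balanced"},
}


def validate_options(options):
    """Raise ValueError if an enumerated option has an unsupported value."""
    for key, value in options.items():
        allowed = ALLOWED_VALUES.get(key)
        # Keys without an enumerated value set are passed through unchecked.
        if allowed is not None and value not in allowed:
            raise ValueError(
                f"{key}={value!r} is not one of {sorted(allowed, key=str)}"
            )
    return options
```

For example, `validate_options({"format": "mp3", "mp3_bitrate": 128})` returns the dictionary unchanged, while an MP3 bitrate of 96 raises `ValueError`.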