Text to Speech
This endpoint only accepts application/json and application/msgpack.
For best results, upload reference audio with the create model endpoint before using this endpoint. This improves speech quality and reduces latency.
To upload audio clips directly, without pre-uploading, serialize the request body with MessagePack as per the instructions.
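The choice between the two content types can be sketched as follows. JSON cannot carry raw bytes, which is why a body with inline audio needs MessagePack. This is a minimal sketch, not the official client: the field names `text`, `references` and `audio` are assumptions used for illustration, and the MessagePack path relies on the third-party `msgpack` package.

```python
import json

try:
    import msgpack  # third-party package: pip install msgpack
except ImportError:  # keep the sketch importable without it
    msgpack = None


def contains_bytes(value):
    """Return True if any nested value is raw bytes (e.g. an audio clip)."""
    if isinstance(value, (bytes, bytearray)):
        return True
    if isinstance(value, dict):
        return any(contains_bytes(v) for v in value.values())
    if isinstance(value, (list, tuple)):
        return any(contains_bytes(v) for v in value)
    return False


def encode_body(payload):
    """Pick a Content-Type and serialize the payload to bytes."""
    if not contains_bytes(payload):
        # Plain values only: JSON is sufficient.
        return "application/json", json.dumps(payload).encode("utf-8")
    if msgpack is None:
        raise RuntimeError("inline audio requires the msgpack package")
    # Raw audio bytes present: MessagePack can encode them directly.
    return "application/msgpack", msgpack.packb(payload, use_bin_type=True)


# Hypothetical payload without inline audio, so JSON is chosen.
content_type, body = encode_body({"text": "Hello, world"})
```

The returned content type should be sent as the Content-Type header so that it matches the serialized body.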
Audio formats supported:
- WAV / PCM (16-bit, 44100 Hz, mono)
- MP3 (44100 Hz, mono)
- Opus (48000 Hz, mono)
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
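As an illustration, the header described above could be built like this. The helper name `auth_headers` is hypothetical, and the whitespace check is a defensive assumption rather than a documented rule.

```python
def auth_headers(token, content_type="application/json"):
    """Build request headers with a Bearer token (illustrative helper)."""
    if not token or any(ch.isspace() for ch in token):
        raise ValueError("token must be a non-empty string without whitespace")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": content_type,
    }


# "my-secret-token" is a placeholder, not a real credential.
headers = auth_headers("my-secret-token")
```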
Body
Text to be converted to speech
References to be used for the speech. This requires MessagePack serialization and overrides reference_voices and reference_texts.
ID of the reference model to be used for the speech
Chunk length to be used for the speech
100 < x < 300
Whether to normalize the speech. This reduces latency but may reduce performance on numbers and dates.
Format to be used for the speech
wav, pcm, mp3, opus
MP3 Bitrate to be used for the speech
64, 128, 192
Opus Bitrate to be used for the speech
-1000, 24, 32, 48, 64
Latency mode to be used for the speech. The balanced mode reduces latency but may degrade performance.
normal, balanced
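The constraints listed above can be checked on the client before sending a request. The sketch below is one way to do that. The field names (`text`, `chunk_length`, `format`, `mp3_bitrate`, `opus_bitrate`, `latency`) are assumptions, since this section lists the descriptions and allowed values but not the parameter names. Only the ranges and option lists come from the text.

```python
# Allowed values as listed in the Body section; the keys are assumed names.
ALLOWED = {
    "format": {"wav", "pcm", "mp3", "opus"},
    "mp3_bitrate": {64, 128, 192},
    "opus_bitrate": {-1000, 24, 32, 48, 64},
    "latency": {"normal", "balanced"},
}


def validate_body(body):
    """Return a list of human-readable problems; empty means no issue found."""
    errors = []
    if "text" in body and not isinstance(body["text"], str):
        errors.append("text must be a string")
    chunk = body.get("chunk_length")
    if chunk is not None and not (100 < chunk < 300):
        errors.append("chunk_length must satisfy 100 < x < 300")
    for key, allowed in ALLOWED.items():
        if key in body and body[key] not in allowed:
            errors.append(f"{key} must be one of {sorted(allowed)}")
    return errors


# Hypothetical request body that satisfies every listed constraint.
problems = validate_body({
    "text": "Hello, world",
    "chunk_length": 200,
    "format": "mp3",
    "mp3_bitrate": 128,
    "latency": "balanced",
})
```

Returning a list of problems instead of raising on the first one lets a caller report every invalid field at once.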