Use two-way websocket to get real-time TTS audio
reference_id: This option uses a model that you’ve previously uploaded or chosen from the playground. Replace "MODEL_ID_UPLOADED_OR_CHOSEN_FROM_PLAYGROUND" with the actual model ID.
Set the backend parameter when calling the tts method to choose the TTS model. Available options include:
"speech-1.5" (default)
"speech-1.6"
"s1"
temperature (default: 0.7): Controls randomness in the speech generation. Higher values (e.g., 1.0) make the output more random, while lower values (e.g., 0.1) make it more deterministic.
top_p (default: 0.7): Controls diversity via nucleus sampling. Lower values (e.g., 0.1) make the output more focused, while higher values (e.g., 1.0) allow more diversity.
Replace "your_api_key" with your actual API key, and adjust the file paths as needed.
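The options described above can be gathered and sanity-checked in one place before a request is made. The following is a minimal sketch using only the standard library. The names build_tts_options and KNOWN_BACKENDS are hypothetical and not part of any SDK, the 0 to 1 range check is an assumption based on the example values in the text, and the returned dictionary is not guaranteed to match the shape the tts method expects. Only the option names, the listed backends, and the defaults come from this page.

```python
from typing import Any, Dict

# Backends listed on this page; "speech-1.5" is the documented default.
KNOWN_BACKENDS = ("speech-1.5", "speech-1.6", "s1")


def build_tts_options(
    reference_id: str,
    backend: str = "speech-1.5",
    temperature: float = 0.7,
    top_p: float = 0.7,
) -> Dict[str, Any]:
    """Hypothetical helper: collect and validate the TTS options."""
    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"unknown backend {backend!r}, expected one of {KNOWN_BACKENDS}")
    # Assumption: both sampling parameters stay within the 0..1 range
    # used by the example values in the text.
    for name, value in (("temperature", temperature), ("top_p", top_p)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return {
        "reference_id": reference_id,
        "backend": backend,
        "temperature": temperature,
        "top_p": top_p,
    }


# A more deterministic voice on the "s1" backend; top_p keeps its default.
options = build_tts_options(
    "MODEL_ID_UPLOADED_OR_CHOSEN_FROM_PLAYGROUND",
    backend="s1",
    temperature=0.1,
)
print(options)
```

Rejecting unknown backends locally gives an immediate error instead of a failed request, at the cost of needing an update whenever a new model is released.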
wss://api.fish.audio/v1/tts/live
Authorization: Bearer token authentication with your API key
model (optional): Specify which TTS model to use. Available options include:
speech-1.5 (default)
speech-1.6
s1
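As an illustration of the connection settings above, the sketch below assembles the two headers for the WebSocket handshake. The helper name build_ws_headers is hypothetical, and how the headers are handed to a WebSocket client depends on the library you use, so that step is left out. The endpoint URL, the Bearer scheme, and the optional model header are taken from this page.

```python
from typing import Dict, Optional

# Endpoint given on this page.
WS_URL = "wss://api.fish.audio/v1/tts/live"


def build_ws_headers(api_key: str, model: Optional[str] = None) -> Dict[str, str]:
    """Hypothetical helper: headers for the WebSocket handshake."""
    headers = {"Authorization": f"Bearer {api_key}"}
    # The model header is optional; when omitted, the documented
    # default (speech-1.5) applies.
    if model is not None:
        headers["model"] = model
    return headers


headers = build_ws_headers("your_api_key", model="speech-1.6")
print(WS_URL, headers)
```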
start - Initializes the TTS session.
text - Sends text chunks.
audio - Receives audio data (server response).
stop - Ends the session.
flush - Flushes the text buffer.
This immediately generates and returns the audio. If the text is too short, the resulting audio may be of lower quality.
finish - Ends the session (server side).
log - Logs messages from the server if debug is true.
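The event flow above can be sketched without a network connection. In the example below only the event names (start, text, audio, stop, flush, finish, log) come from this page. The field names (event, request, debug, text, audio, message), the contents of the start request, and the EventCollector class are assumptions made for illustration, and the wire encoding of the messages is not covered.

```python
from typing import Any, Dict, List


# Client -> server messages. Field names are assumptions.
def start_event(request: Dict[str, Any], debug: bool = False) -> Dict[str, Any]:
    return {"event": "start", "request": request, "debug": debug}


def text_event(text: str) -> Dict[str, Any]:
    return {"event": "text", "text": text}


def flush_event() -> Dict[str, Any]:
    return {"event": "flush"}


def stop_event() -> Dict[str, Any]:
    return {"event": "stop"}


class EventCollector:
    """Hypothetical handler for server -> client messages."""

    def __init__(self) -> None:
        self.audio = bytearray()
        self.logs: List[str] = []
        self.finished = False

    def handle(self, message: Dict[str, Any]) -> None:
        event = message.get("event")
        if event == "audio":
            # Append each audio chunk in arrival order.
            self.audio.extend(message["audio"])
        elif event == "log":
            # Only expected when the session was started with debug enabled.
            self.logs.append(message["message"])
        elif event == "finish":
            self.finished = True
        else:
            raise ValueError(f"unexpected event: {event!r}")


# Messages a client might send, in order.
outgoing = [
    start_event({"reference_id": "MODEL_ID_UPLOADED_OR_CHOSEN_FROM_PLAYGROUND"}, debug=True),
    text_event("Hello, "),
    text_event("world."),
    flush_event(),
    stop_event(),
]

# Simulated server responses, standing in for a real connection.
collector = EventCollector()
for incoming in [
    {"event": "log", "message": "session started"},
    {"event": "audio", "audio": b"\x00\x01"},
    {"event": "audio", "audio": b"\x02"},
    {"event": "finish"},
]:
    collector.handle(incoming)

print(len(collector.audio), collector.logs, collector.finished)
```

Keeping the message builders separate from the handler makes it possible to test the flow with simulated responses before wiring it to a real WebSocket client.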
apt-get install mpv
brew install mpv