Convert text to natural-sounding speech
For better speech quality and lower latency, upload reference audio via the create model endpoint. The method described here uses the Fish Audio SDK, which is more streamlined than calling the raw API directly.
First, make sure you have the Fish Audio SDK installed. You can install it from GitHub or PyPI.
This example demonstrates three ways to use the Text-to-Speech API:
Using a reference_id: This option uses a model that you’ve previously uploaded or chosen from the playground. Replace "MODEL_ID_UPLOADED_OR_CHOSEN_FROM_PLAYGROUND" with the actual model ID.
Using reference audio: This option allows you to provide a reference audio file and its corresponding text directly in the request.
Using a specific TTS model: You can specify which model to use with the backend parameter when calling the tts method. Available options include:
"speech-1.5"
"speech-1.6"
"s1"
Make sure to replace "your_api_key" with your actual API key, and adjust the file paths as needed.
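The three options above can be sketched as follows. This is a minimal sketch, not the official example: it assumes the SDK exposes Session, TTSRequest, and ReferenceAudio from the fish_audio_sdk package and that session.tts yields the audio as chunks of bytes. The helpers check_backend and synthesize are hypothetical names introduced here for illustration, and all file paths are placeholders.

```python
from typing import Optional

# Model names listed in this guide for the backend parameter.
AVAILABLE_BACKENDS = ("speech-1.5", "speech-1.6", "s1")


def check_backend(backend: str) -> str:
    """Hypothetical helper: reject backend names not listed in this guide."""
    if backend not in AVAILABLE_BACKENDS:
        raise ValueError(
            f"unknown backend {backend!r}; expected one of {AVAILABLE_BACKENDS}"
        )
    return backend


def synthesize(
    api_key: str,
    text: str,
    out_path: str,
    *,
    backend: str,
    reference_id: Optional[str] = None,
    reference_audio_path: Optional[str] = None,
    reference_text: Optional[str] = None,
) -> None:
    """Hypothetical wrapper covering the three options described above."""
    backend = check_backend(backend)

    # Imported lazily so check_backend works without the SDK installed.
    # Assumption: these names are exported by the fish_audio_sdk package.
    from fish_audio_sdk import ReferenceAudio, Session, TTSRequest

    session = Session(api_key)

    if reference_id is not None:
        # Option 1: a model previously uploaded or chosen from the playground.
        request = TTSRequest(reference_id=reference_id, text=text)
    elif reference_audio_path is not None:
        # Option 2: reference audio and its transcript sent with the request.
        if reference_text is None:
            raise ValueError("reference_text is required with reference audio")
        with open(reference_audio_path, "rb") as audio_file:
            reference = ReferenceAudio(audio=audio_file.read(), text=reference_text)
        request = TTSRequest(text=text, references=[reference])
    else:
        # No reference at all: only the text and the chosen backend.
        request = TTSRequest(text=text)

    # Option 3: the backend parameter selects the TTS model.
    with open(out_path, "wb") as out:
        for chunk in session.tts(request, backend=backend):
            out.write(chunk)
```

A call such as synthesize("your_api_key", "Hello, world!", "output.mp3", backend="s1", reference_id="MODEL_ID_UPLOADED_OR_CHOSEN_FROM_PLAYGROUND") would then write the generated audio to output.mp3, provided the assumptions above hold.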
If you prefer to use the raw API instead of the SDK, you can still use the MessagePack API as described below.
authorization: Bearer token authentication with your API key
content-type: application/msgpack
model (optional): Specify which TTS model to use. Available options include:
speech-1.5
speech-1.6
s1
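For the raw API, the fields above could be assembled like this. The sketch assumes they are sent as HTTP request headers; build_tts_headers is a hypothetical helper, not part of any SDK. The endpoint URL and the MessagePack request body are deliberately left out, since they are not specified in this section.

```python
from typing import Dict, Optional


def build_tts_headers(api_key: str, model: Optional[str] = None) -> Dict[str, str]:
    """Hypothetical helper: build the headers described above."""
    headers = {
        # Bearer token authentication with the API key.
        "authorization": f"Bearer {api_key}",
        # The request body is MessagePack-encoded.
        "content-type": "application/msgpack",
    }
    if model is not None:
        # Optional: one of "speech-1.5", "speech-1.6", or "s1".
        headers["model"] = model
    return headers
```

The returned dictionary can be passed to whichever HTTP client you use; leaving out model simply omits the optional field.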