Text to Speech
Convert text to natural-sounding speech
For better speech quality and lower latency, upload reference audio via the create model endpoint. This method uses the Fish Audio SDK and provides a more streamlined approach.
Using the Fish Audio SDK
First, make sure you have the Fish Audio SDK installed. You can install it from GitHub or PyPI.
Example Usage
This example demonstrates three ways to use the Text-to-Speech API:
- Using a reference_id: This option uses a model that you've previously uploaded or chosen from the playground. Replace "MODEL_ID_UPLOADED_OR_CHOSEN_FROM_PLAYGROUND" with the actual model ID.
- Using reference audio: This option allows you to provide a reference audio file and its corresponding text directly in the request.
- Using a specific TTS model: You can specify which model to use with the backend parameter when calling the tts method. Available options include:
  - "speech-1.5"
  - "speech-1.6"
Make sure to replace "your_api_key" with your actual API key, and adjust the file paths as needed.
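The three options above can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the SDK is published on PyPI as fish-audio-sdk and exposes Session, TTSRequest, and ReferenceAudio from the fish_audio_sdk module, which may differ between SDK versions. The helper names save_stream and run_examples, the file names, and the sample text are hypothetical and chosen only for illustration.

```python
# Assumed install command: pip install fish-audio-sdk
from pathlib import Path
from typing import Iterable


def save_stream(chunks: Iterable[bytes], path: str) -> int:
    """Write streamed audio chunks to `path` and return the bytes written."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            total += len(chunk)
    return total


def run_examples(api_key: str = "your_api_key") -> None:
    # Imported lazily so the helper above works without the SDK installed.
    # These names are assumed from the SDK's published usage.
    from fish_audio_sdk import ReferenceAudio, Session, TTSRequest

    session = Session(api_key)

    # Option 1: a model referenced by its reference_id.
    request = TTSRequest(
        reference_id="MODEL_ID_UPLOADED_OR_CHOSEN_FROM_PLAYGROUND",
        text="Hello, world!",
    )
    save_stream(session.tts(request), "output1.mp3")

    # Option 2: reference audio and its transcript sent with the request.
    # "reference.wav" is a placeholder path; adjust it to your own file.
    reference = ReferenceAudio(
        audio=Path("reference.wav").read_bytes(),
        text="Transcript of the reference audio",
    )
    request = TTSRequest(text="Hello, world!", references=[reference])
    save_stream(session.tts(request), "output2.mp3")

    # Option 3: a specific TTS model selected with the backend parameter.
    request = TTSRequest(text="Hello, world!")
    save_stream(session.tts(request, backend="speech-1.5"), "output3.mp3")
```

Calling run_examples("your_api_key") with a real key would write three audio files to the working directory. Streaming the chunks straight to disk avoids holding the whole audio response in memory.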
Raw API Usage
If you prefer to use the raw API instead of the SDK, you can still use the MessagePack API as described below.
Endpoint Details
- Method: POST
- URL: https://api.fish.audio/v1/tts
- Content-Type: application/msgpack
Headers
- authorization: Bearer token authentication with your API key
- content-type: application/msgpack
- model (optional): Specify which TTS model to use. Available options include:
  - speech-1.5
  - speech-1.6
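As an illustration of the endpoint and headers listed above, the following sketch builds the POST request using only the Python standard library. The helper names (build_headers, build_request, synthesize) are hypothetical. The request body shown here, a MessagePack map with a single "text" field, is an assumption rather than the documented schema, and encoding it relies on the third-party msgpack package.

```python
import urllib.request
from typing import Dict, Optional

TTS_URL = "https://api.fish.audio/v1/tts"


def build_headers(api_key: str, model: Optional[str] = None) -> Dict[str, str]:
    """Return the request headers; `model` is only sent when given."""
    headers = {
        "authorization": f"Bearer {api_key}",
        "content-type": "application/msgpack",
    }
    if model is not None:
        headers["model"] = model  # e.g. "speech-1.5" or "speech-1.6"
    return headers


def build_request(
    api_key: str, body: bytes, model: Optional[str] = None
) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for the TTS endpoint."""
    return urllib.request.Request(
        TTS_URL,
        data=body,
        headers=build_headers(api_key, model),
        method="POST",
    )


def synthesize(api_key: str, text: str, model: Optional[str] = None) -> bytes:
    """Send the request and return the raw audio bytes from the response."""
    import msgpack  # third-party; assumed install: pip install msgpack

    # Assumption: the body is a MessagePack map with a "text" field.
    body = msgpack.packb({"text": text})
    with urllib.request.urlopen(build_request(api_key, body, model)) as resp:
        return resp.read()
```

Keeping header construction separate from sending makes the request easy to inspect before any network call, for example to confirm that the model header is only present when a model was chosen.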