Speech to Text
Convert speech to text
For better speech recognition quality, you can specify the language of the audio input. If not specified, the system will attempt to automatically detect the language.
Using the Fish Audio SDK
First, make sure you have the Fish Audio SDK installed. You can install it from GitHub or PyPI.
Example Usage
This example demonstrates three ways to use the Speech-to-Text API:
- Without specifying a language: The system will attempt to auto-detect the language.
- Specifying the language: You can provide the language code (e.g., “en” for English) for potentially better recognition.
- With precise timestamps: By setting ignore_timestamps=False, you can get more accurate timing information for each segment. Note that this may increase latency for short audio files.
The ignore_timestamps parameter is set to True by default. This reduces latency for short audio files.
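The three modes above can be sketched as follows. This is a minimal illustration, not the SDK's reference example: the import path fish_audio_sdk and the names Session, ASRRequest, and session.asr are assumptions about the SDK's interface, as is the audio keyword, and should be checked against the installed version. build_asr_options and transcribe are hypothetical helpers introduced only for this sketch.

```python
from typing import Any, Dict, Optional


def build_asr_options(language: Optional[str] = None,
                      precise_timestamps: bool = False) -> Dict[str, Any]:
    """Map the three usage modes onto request keyword arguments."""
    # ignore_timestamps defaults to True; precise timing means setting it to False.
    options: Dict[str, Any] = {"ignore_timestamps": not precise_timestamps}
    if language is not None:
        # Leave the key out entirely to let the service auto-detect the language.
        options["language"] = language
    return options


def transcribe(api_key: str, audio_path: str, **options: Any):
    """Send one audio file for transcription (needs the SDK and network access)."""
    # Assumed import path and class names; verify against the installed SDK.
    from fish_audio_sdk import ASRRequest, Session

    with open(audio_path, "rb") as audio_file:
        audio_bytes = audio_file.read()
    session = Session(api_key)
    return session.asr(ASRRequest(audio=audio_bytes, **options))


# The three modes described above, expressed as option sets:
auto_detect = build_asr_options()
english = build_asr_options(language="en")
with_timestamps = build_asr_options(language="en", precise_timestamps=True)
```

A call such as transcribe("YOUR_API_KEY", "input_audio.mp3", **english) would then run the second mode. Keeping the option building separate from the network call makes it easy to check which parameters are sent before spending a request.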
Raw API Usage
If you prefer to use the raw API instead of the SDK, you can use the MessagePack API as described below.
Endpoint Details
- Method: POST
- URL: https://api.fish.audio/v1/asr
- Content-Type: multipart/form-data or application/msgpack
Example Usage
This example shows how to use the raw API with MessagePack serialization. You can also use multipart/form-data by changing the Content-Type header and adjusting the request data format accordingly.
Make sure to replace "YOUR_API_KEY" with your actual API key, and adjust the file paths as needed.
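A raw request along these lines can be sketched with the standard library plus the third-party msgpack package. The endpoint, method, and Content-Type come from the details above; the Bearer authorization scheme and the audio field name are assumptions, and encode_payload and build_request are hypothetical helper names, so check them against the API reference before relying on this.

```python
import urllib.request
from typing import Optional

ASR_URL = "https://api.fish.audio/v1/asr"


def encode_payload(audio_bytes: bytes,
                   language: Optional[str] = None,
                   ignore_timestamps: bool = True) -> bytes:
    """Serialize the request fields with MessagePack."""
    import msgpack  # third-party package: pip install msgpack

    # "audio" is an assumed field name for the raw audio bytes.
    fields = {"audio": audio_bytes, "ignore_timestamps": ignore_timestamps}
    if language is not None:
        fields["language"] = language
    return msgpack.packb(fields, use_bin_type=True)


def build_request(body: bytes, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the ASR endpoint."""
    return urllib.request.Request(
        ASR_URL,
        data=body,
        method="POST",
        headers={
            # Assumed scheme: the API key is sent as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/msgpack",
        },
    )
```

To send it, read the audio file in binary mode, pass the bytes through encode_payload, and hand the result of build_request to urllib.request.urlopen. Building the request separately from sending it lets you inspect the URL and headers first.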