For better speech recognition quality, you can specify the language of the audio input. If not specified, the system will attempt to automatically detect the language.

Using the Fish Audio SDK

First, make sure you have the Fish Audio SDK installed. You can install it from GitHub or PyPI.

Example Usage

This example demonstrates three ways to use the Speech-to-Text API:

  1. Without specifying a language: The system will attempt to auto-detect the language.
  2. Specifying the language: You can provide the language code (e.g., “en” for English) for potentially better recognition.
  3. With precise timestamps: By setting ignore_timestamps=False, you can get more accurate timing information for each segment. Note that this may increase latency for short audio files.

The ignore_timestamps parameter is set to True by default, which reduces latency for short audio files.
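The three variants above can be sketched as follows. This is a minimal sketch, not the SDK's reference example: it assumes the SDK exposes Session and ASRRequest from a fish_audio_sdk package and that ASRRequest accepts audio, language, and ignore_timestamps. The names asr_kwargs and transcribe are hypothetical helpers introduced here for illustration. Check them against the SDK version you have installed.

```python
from typing import Any, Dict, Optional


def asr_kwargs(audio: bytes, language: Optional[str] = None,
               ignore_timestamps: bool = True) -> Dict[str, Any]:
    """Build the keyword arguments for one speech-to-text request."""
    kwargs: Dict[str, Any] = {
        "audio": audio,
        "ignore_timestamps": ignore_timestamps,
    }
    if language is not None:
        # Send a language code only when it is known; otherwise the
        # service is left to auto-detect the language.
        kwargs["language"] = language
    return kwargs


def transcribe(api_key: str, audio_path: str, **options: Any):
    """Send one request through the SDK (assumed API, see the note above)."""
    # Imported lazily so asr_kwargs can be used without the SDK installed.
    from fish_audio_sdk import ASRRequest, Session  # assumed import path

    with open(audio_path, "rb") as audio_file:
        audio = audio_file.read()
    session = Session(api_key)
    return session.asr(ASRRequest(**asr_kwargs(audio, **options)))
```

Under these assumptions, transcribe("YOUR_API_KEY", "input_audio.mp3") uses auto-detection, adding language="en" specifies the language, and adding ignore_timestamps=False requests precise timestamps. The file name input_audio.mp3 is a placeholder.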

Raw API Usage

If you prefer to use the raw API instead of the SDK, you can use the MessagePack API as described below.

Endpoint Details

Example Usage

This example shows how to use the raw API with MessagePack serialization. You can also use multipart/form-data by changing the Content-Type header and adjusting the request data format accordingly.
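A rough sketch of such a MessagePack request is shown here. Several details are assumptions rather than facts taken from this page: the endpoint URL, the Bearer authorization scheme, the payload field names, and a JSON response body. The function names are hypothetical, and the third-party msgpack package is assumed for serialization. Confirm each of these against the endpoint details before relying on it.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

# Assumed endpoint; confirm against the endpoint details for your account.
ASR_URL = "https://api.fish.audio/v1/asr"


def build_asr_payload(audio: bytes, language: Optional[str] = None,
                      ignore_timestamps: bool = True) -> Dict[str, Any]:
    """Build the request body as a plain dict (field names are assumed)."""
    payload: Dict[str, Any] = {
        "audio": audio,
        "ignore_timestamps": ignore_timestamps,
    }
    if language is not None:
        payload["language"] = language
    return payload


def build_headers(api_key: str) -> Dict[str, str]:
    """Headers for a MessagePack-encoded request (Bearer auth is assumed)."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/msgpack",
    }


def transcribe_raw(api_key: str, audio_path: str, **options: Any) -> Any:
    """Serialize the payload with MessagePack and POST it to the endpoint."""
    import msgpack  # third-party package: pip install msgpack

    with open(audio_path, "rb") as audio_file:
        payload = build_asr_payload(audio_file.read(), **options)
    body = msgpack.packb(payload, use_bin_type=True)
    request = urllib.request.Request(
        ASR_URL, data=body, headers=build_headers(api_key), method="POST"
    )
    with urllib.request.urlopen(request) as response:
        # The response is assumed to be JSON.
        return json.loads(response.read())
```

With these assumptions, transcribe_raw("YOUR_API_KEY", "input_audio.mp3", language="en") sends a single request. Switching to multipart/form-data would mean changing the Content-Type header and encoding the same fields as form parts instead of a MessagePack body.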

Make sure to replace "YOUR_API_KEY" with your actual API key, and adjust the file paths as needed.