POST /v1/asr
Speech to Text
curl --request POST \
  --url https://api.fish.audio/v1/asr \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form 'language=<string>' \
  --form ignore_timestamps=true \
  --form audio=@example-file
{
  "text": "<string>",
  "duration": 123,
  "segments": [
    {
      "text": "<string>",
      "start": 123,
      "end": 123
    }
  ]
}
This endpoint is in beta. It accepts only multipart/form-data and application/msgpack request bodies.

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

audio
file
required

Audio to be converted to text

language
string | null

Language of the speech in the audio

ignore_timestamps
boolean
default:true

Controls whether precise timestamps are returned in the text. Returning precise timestamps increases latency for audio shorter than 30 seconds.
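
The body fields above can be assembled into a multipart/form-data request roughly as follows. This is a minimal sketch using only the Python standard library, not an official client. The helper names `encode_multipart` and `transcribe`, the `application/octet-stream` part type, and the assumption that the response body is JSON (as in the sample response) are illustrative choices. The `true`/`false` form values mirror the curl example.

```python
import json
import urllib.request
import uuid

API_URL = "https://api.fish.audio/v1/asr"  # endpoint taken from the curl example


def encode_multipart(fields, filename, audio_bytes):
    """Encode text fields plus one 'audio' file part as multipart/form-data.

    Hypothetical helper: returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The file part carries the raw audio bytes under the required "audio" field.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
        + audio_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, token, language=None, ignore_timestamps=True):
    """POST an audio file to the ASR endpoint and return the decoded response.

    Assumes the response is JSON shaped like the sample response.
    """
    fields = {"ignore_timestamps": "true" if ignore_timestamps else "false"}
    if language is not None:  # language is optional (string | null)
        fields["language"] = language
    with open(path, "rb") as f:
        body, content_type = encode_multipart(fields, "audio", f.read())
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": content_type,
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `transcribe("example-file", "<token>", language="en")` would send the same three fields as the curl example. Omitting `language` leaves the field out of the request entirely.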

Response

Request fulfilled, document follows

text
string
required
duration
number
required

Duration of the audio in seconds

segments
ASRSegment · object[]
required
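
The response fields listed above can be mapped onto typed objects as in the sketch below. The class and function names (`ASRSegment`, `ASRResponse`, `parse_asr_response`) are hypothetical, and the segment fields `text`, `start`, and `end` are taken from the sample response. The unit of `start` and `end` is not stated in this section, so the code stores them as plain numbers without conversion.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ASRSegment:
    text: str
    start: float  # unit not stated in this section; kept as returned
    end: float


@dataclass
class ASRResponse:
    text: str
    duration: float  # duration of the audio in seconds
    segments: List[ASRSegment]


def parse_asr_response(payload: dict) -> ASRResponse:
    """Build an ASRResponse from a decoded response body.

    Raises KeyError if a required field (text, duration, segments) is missing.
    """
    segments = [
        ASRSegment(text=s["text"], start=s["start"], end=s["end"])
        for s in payload["segments"]
    ]
    return ASRResponse(
        text=payload["text"],
        duration=payload["duration"],
        segments=segments,
    )


if __name__ == "__main__":
    # Made-up payload shaped like the sample response, for illustration only.
    sample = {
        "text": "hello world",
        "duration": 2.5,
        "segments": [
            {"text": "hello", "start": 0.0, "end": 1.0},
            {"text": "world", "start": 1.0, "end": 2.5},
        ],
    }
    result = parse_asr_response(sample)
    for segment in result.segments:
        print(f"[{segment.start} - {segment.end}] {segment.text}")
```

Indexing with `payload["..."]` rather than `.get()` is deliberate: all three top-level fields are marked required, so a missing key surfaces as an error instead of a silent default.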