Documentation Index
Fetch the complete documentation index at: https://docs.fish.audio/llms.txt
Use this file to discover all available pages before exploring further.
Purpose
This page is the recommended starting point for AI agents, RAG pipelines, and documentation crawlers that need accurate Fish Audio references with minimal markup noise.
Built-In Agent Indexes
This documentation site already provides built-in LLM-friendly indexes:
- llms.txt for the curated documentation index
- llms-full.txt for broader site context
Agents should read llms.txt first and only fetch llms-full.txt when they need wider context across the whole documentation set.
Install the Agent Skill
For coding agents that support Agent Skills (Claude Code, Cursor, Windsurf, Codex, and others), install the ready-made raw-API skill with a single command. The skill works with curl, Python, Node.js, or any HTTP client, with no SDK required. It covers authentication, every endpoint in our OpenAPI schema, MessagePack vs JSON vs multipart encoding rules, multi-speaker dialogue, and the WebSocket streaming protocol.
Discovery endpoint: /.well-known/agent-skills/index.json. Run npx skills add https://docs.fish.audio (without --skill) to install every skill published here, including the auto-generated product overview skill.
Retrieval Order
- Read llms.txt for the curated documentation index.
- Read llms-full.txt when broad site context is needed.
- Read OpenAPI for REST schemas, parameters, and examples.
- Read AsyncAPI for the WebSocket streaming protocol.
- Fetch individual .md pages only after narrowing to a specific task.
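The retrieval order above can be sketched as a small planner that returns the URLs to fetch, in order. This is a minimal illustration, not an official client: `retrieval_plan` and its flags are hypothetical names, and the locations of llms-full.txt, openapi.json, and asyncapi.yml under the docs host are assumptions, since this page names those files without giving their full paths.

```python
from urllib.parse import urljoin

# Host taken from the llms.txt URL given at the top of this page.
DOCS_BASE = "https://docs.fish.audio/"

# (key, relative path) in the recommended reading order.
# Every path except llms.txt is an assumption about where the file lives.
RETRIEVAL_ORDER = [
    ("index", "llms.txt"),
    ("broad", "llms-full.txt"),     # assumed to sit next to llms.txt
    ("rest", "openapi.json"),       # assumed path
    ("streaming", "asyncapi.yml"),  # assumed path
]


def retrieval_plan(broad=False, rest=False, streaming=False):
    """Return the URLs an agent should fetch, preserving the documented order."""
    wanted = {"index"}  # the curated index is always read first
    if broad:
        wanted.add("broad")
    if rest:
        wanted.add("rest")
    if streaming:
        wanted.add("streaming")
    return [urljoin(DOCS_BASE, path) for key, path in RETRIEVAL_ORDER if key in wanted]
```

Individual .md pages are deliberately left out of the plan: they are fetched only once the task has been narrowed down.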
Canonical API Facts
- Base API URL: https://api.fish.audio
- Authentication: Authorization: Bearer <FISH_API_KEY>
- TTS model selection: send a required model header. Recommended default: s2-pro
- Main REST endpoints:
  - POST /v1/tts
  - POST /v1/asr
  - GET /model
  - POST /model
  - GET /model/{id}
  - PATCH /model/{id}
  - DELETE /model/{id}
- Real-time streaming endpoint: wss://api.fish.audio/v1/tts/live
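As a rough illustration of how these facts combine, the sketch below assembles (but does not send) a POST /v1/tts request. The base URL, bearer authentication, model header, and s2-pro default come from the list above. The helper name `build_tts_request`, the JSON encoding choice, and the "text" body field are assumptions; the OpenAPI schema is the authority on the actual request body.

```python
import json

API_BASE = "https://api.fish.audio"


def build_tts_request(text, api_key, model="s2-pro"):
    """Describe a POST /v1/tts request as a plain dict, without sending it."""
    if not api_key:
        raise ValueError("api_key is required")
    return {
        "method": "POST",
        "url": f"{API_BASE}/v1/tts",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "model": model,  # required header that selects the TTS model
            "Content-Type": "application/json",
        },
        # Assumption: a JSON body with a "text" field. Verify against OpenAPI.
        "body": json.dumps({"text": text}).encode("utf-8"),
    }
```

The returned dict can be handed to any HTTP client. Keeping request construction separate from sending makes the headers easy to inspect before any network call is made.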
High-Value URLs
Start Here
API Specs
Authentication And SDK Setup
Core Product Tasks
- Text to Speech Guide
- Speech to Text Guide
- Creating Voice Models
- Emotion Control
- Fine-grained Control
Real-Time And Integrations
- WebSocket TTS Streaming
- Real-time Voice Streaming Best Practices
- Python WebSocket Streaming
- JavaScript WebSocket
- LiveKit Integration
- Pipecat Integration
Models, Pricing, And Lifecycle
Task Routing
- If the task is “generate speech”, start with Quick Start, the Text to Speech guide, and POST /v1/tts.
- If the task is “transcribe audio”, start with the Speech to Text guide and POST /v1/asr.
- If the task is “clone or manage voices”, start with Creating Voice Models and the /model endpoints.
- If the task is “stream audio in real time”, start with AsyncAPI, WebSocket TTS Streaming, and the WebSocket SDK guides.
- If the task is “pick the right model or estimate cost”, start with Models Overview and Pricing And Rate Limits.
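The routing rules above amount to a lookup table. The sketch below encodes them as one, assuming an agent has already classified the request into one of the five task phrases. `TASK_ROUTES` and `route` are hypothetical names, and falling back to llms.txt for an unrecognized task is an assumption based on the retrieval order, not a documented rule.

```python
# Task phrase -> starting points, copied from the routing list above.
TASK_ROUTES = {
    "generate speech": ["Quick Start", "Text to Speech Guide", "POST /v1/tts"],
    "transcribe audio": ["Speech to Text Guide", "POST /v1/asr"],
    "clone or manage voices": ["Creating Voice Models", "/model endpoints"],
    "stream audio in real time": [
        "AsyncAPI",
        "WebSocket TTS Streaming",
        "WebSocket SDK guides",
    ],
    "pick the right model or estimate cost": [
        "Models Overview",
        "Pricing And Rate Limits",
    ],
}


def route(task):
    """Return the starting points for a task, or the curated index if unknown."""
    key = task.strip().lower()
    # Assumed fallback: unknown tasks start from the curated index.
    return TASK_ROUTES.get(key, ["llms.txt"])
```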
Notes For Agents
- Prefer openapi.json and asyncapi.yml for machine-readable schemas.
- Prefer .md URLs when you need a single human-authored page in Markdown form.
- Some richer pages use interactive MDX widgets. If a fetched page contains UI or component noise, fall back to this page, llms.txt, llms-full.txt, or the API spec files first.
- Treat this page as the canonical low-noise entry point for Fish Audio documentation retrieval.




