Overview
Transform any text into natural, expressive speech using Fish Audio’s advanced TTS models. Choose from pre-made voices or use your own cloned voices.
Quick Start
Web Interface
The easiest way to generate speech:
1. Visit Playground: Go to fish.audio and log in
2. Enter Your Text: Type or paste the text you want to convert
3. Choose a Voice: Select from available voices or use your own
4. Generate: Click “Generate” and download your audio
Using the SDK
Installation
Install the Fish Audio SDK:
Basic Usage
Generate speech with just a few lines of code:
Voice Options
Using Pre-made Voices
Browse and select voices from the playground:
Using Your Cloned Voice
Use voices you’ve created:
Using Reference Audio
Provide reference audio directly:
Model Selection
Choose the right model for your needs:
Model | Best For | Quality | Speed |
---|---|---|---|
s1 | Latest features | Excellent | Fast |
speech-1.6 | Stable production | Very Good | Fast |
speech-1.5 | Legacy support | Good | Fastest |
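As a rough illustration of the table above, the choice can be encoded as a small lookup. The priority keys (`latest`, `stable`, `legacy`) and the `pick_model` helper are hypothetical names introduced only for this sketch; the model names themselves come from the table.

```python
# Hypothetical helper: the priority keys are invented for this sketch.
# Only the model names are taken from the table above.
MODELS_BY_PRIORITY = {
    "latest": "s1",           # latest features
    "stable": "speech-1.6",   # stable production
    "legacy": "speech-1.5",   # legacy support
}


def pick_model(priority: str = "stable") -> str:
    """Return the model name for a priority, or raise on unknown input."""
    try:
        return MODELS_BY_PRIORITY[priority]
    except KeyError:
        raise ValueError(f"unknown priority: {priority!r}") from None
```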
Advanced Options
Audio Formats
Choose your output format:
Chunk Length
Control text processing chunks:
Latency Mode
Optimize for speed or quality:
Balanced mode reduces latency to ~300 ms but may slightly decrease stability.
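One way to keep these three options together is a small validated container. This is a minimal sketch, not the SDK's own type: `TTSOptions` and `as_params` are hypothetical names, the defaults mirror the request-parameter table in the API Reference, and the only latency values accepted are the two this guide mentions.

```python
from dataclasses import dataclass

# Latency modes named in this guide.
LATENCY_MODES = ("normal", "balanced")


@dataclass(frozen=True)
class TTSOptions:
    """Hypothetical container for the advanced options described above."""

    format: str = "mp3"       # output audio format
    chunk_length: int = 200   # characters per chunk
    latency: str = "normal"   # "normal" or "balanced"

    def __post_init__(self) -> None:
        # Reject values this guide does not describe.
        if self.latency not in LATENCY_MODES:
            raise ValueError(f"latency must be one of {LATENCY_MODES}")
        if self.chunk_length <= 0:
            raise ValueError("chunk_length must be a positive integer")

    def as_params(self) -> dict:
        """Return the options as a dict of request parameters."""
        return {
            "format": self.format,
            "chunk_length": self.chunk_length,
            "latency": self.latency,
        }
```

Validating early means a typo such as `latency="balance"` fails locally instead of after a network round trip.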
Direct API Usage
For direct API calls without the SDK:
Streaming Audio
Stream audio for real-time applications:
Adding Emotions
Make your speech more expressive:
- Basic: (happy), (sad), (angry), (excited), (calm)
- Tones: (shouting), (whispering), (soft tone)
- Effects: (laughing), (sighing), (crying)
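The markers above are plain text in parentheses, so they can be added with ordinary string handling. The sketch below assumes a marker is placed directly before the text it should affect; `with_emotion` is a hypothetical helper, not part of the SDK.

```python
# Markers listed in this guide, grouped the same way.
EMOTION_MARKERS = {
    "basic": ["happy", "sad", "angry", "excited", "calm"],
    "tones": ["shouting", "whispering", "soft tone"],
    "effects": ["laughing", "sighing", "crying"],
}
KNOWN_MARKERS = {m for group in EMOTION_MARKERS.values() for m in group}


def with_emotion(text: str, marker: str) -> str:
    """Prefix text with an emotion marker such as "(happy)".

    Assumption: the marker goes directly before the text it affects.
    """
    if marker not in KNOWN_MARKERS:
        raise ValueError(f"unknown emotion marker: {marker!r}")
    return f"({marker}) {text}"


print(with_emotion("We won the game!", "excited"))
# prints: (excited) We won the game!
```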
Best Practices
Text Preparation
Do:
- Use proper punctuation for natural pauses
- Add emotion markers for expression
- Break long texts into paragraphs
- Use consistent formatting
Don’t:
- Use ALL CAPS (unless shouting)
- Mix multiple languages randomly
- Include special characters unnecessarily
- Forget punctuation
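Some of these guidelines can be checked automatically before text is sent for synthesis. The following is a rough sketch only: `check_text` is a hypothetical helper, and its thresholds (four or more capital letters, three or more symbols in a row) are arbitrary choices for illustration rather than rules from Fish Audio.

```python
import re


def check_text(text: str) -> list:
    """Return warnings for text that ignores the guidelines above."""
    warnings = []
    stripped = text.strip()

    # Words of four or more letters written entirely in capitals.
    if re.search(r"\b[A-Z]{4,}\b", stripped):
        warnings.append("avoid ALL CAPS unless the speech should be shouted")

    # Text that does not end with sentence punctuation.
    if stripped and not stripped.endswith((".", "!", "?")):
        warnings.append("end the text with punctuation")

    # Runs of three or more symbol characters.
    if re.search(r"[#@$%^&*=+~<>|]{3,}", stripped):
        warnings.append("remove unnecessary special characters")

    return warnings
```

An empty list means none of the three checks fired; it does not guarantee good output.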
Performance Tips
- Batch Processing: Process multiple texts efficiently
- Cache Models: Store frequently used model IDs
- Optimize Chunk Size: Use 200 characters for best balance
- Handle Errors: Implement retry logic for network issues
Quality Optimization
For best results:
- Use high-quality reference audio for cloning
- Choose appropriate emotion markers
- Test different latency modes
- Monitor API rate limits
Troubleshooting
Common Issues
No audio output:
- Check API key validity
- Verify model ID exists
- Ensure proper audio format
Poor audio quality:
- Use better reference audio
- Try normal latency mode
- Check text formatting
Slow generation:
- Use balanced latency mode
- Reduce chunk length
- Check network connection
Code Examples
Batch Processing
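A minimal sketch of batch processing, assuming each text is converted by one independent call. The `synthesize` argument is a hypothetical stand-in for whatever performs a single text-to-speech request (an SDK call or an HTTP request); a placeholder is used here so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def synthesize_batch(
    texts: Iterable[str],
    synthesize: Callable[[str], bytes],
    max_workers: int = 4,
) -> List[bytes]:
    """Convert several texts concurrently and return audio in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(synthesize, texts))


def fake_synthesize(text: str) -> bytes:
    # Placeholder for a real text-to-speech call.
    return text.encode("utf-8")


audio_clips = synthesize_batch(["First line.", "Second line."], fake_synthesize)
```

Keeping `max_workers` small is a cautious default, since concurrent requests count against API rate limits.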
Error Handling
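A minimal sketch of retry logic for network issues, using exponential backoff. `with_retries` is a hypothetical helper; the exception types treated as retryable and the delay values are assumptions, and the `call` argument stands in for a single text-to-speech request.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on network-style errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each time: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

Errors outside `retry_on`, such as an invalid API key, are not retried and propagate immediately, because repeating the request would not help.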
API Reference
Request Parameters
Parameter | Type | Description | Default |
---|---|---|---|
text | string | Text to convert | Required |
reference_id | string | Model/voice ID | None |
format | string | Audio format | "mp3" |
chunk_length | integer | Characters per chunk | 200 |
normalize | boolean | Normalize text | true |
latency | string | Speed vs quality | "normal" |
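The parameters in the table can be assembled into a direct HTTP request with the standard library alone. This sketch builds the request without sending it. The endpoint URL, the JSON body encoding, and the bearer-token header are assumptions not stated in this guide, so check them against the API documentation; `build_tts_request` is a hypothetical helper.

```python
import json
import urllib.request
from typing import Optional

# Assumption: the endpoint and auth scheme are not given in this guide.
TTS_URL = "https://api.fish.audio/v1/tts"


def build_tts_request(
    api_key: str,
    text: str,
    reference_id: Optional[str] = None,
    format: str = "mp3",
    chunk_length: int = 200,
    normalize: bool = True,
    latency: str = "normal",
) -> urllib.request.Request:
    """Build (but do not send) a POST request using the table's defaults."""
    payload = {
        "text": text,
        "format": format,
        "chunk_length": chunk_length,
        "normalize": normalize,
        "latency": latency,
    }
    # reference_id defaults to None, so it is only sent when provided.
    if reference_id is not None:
        payload["reference_id"] = reference_id
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` and calling `.read()` on the response would then yield the audio bytes, assuming the endpoint behaves as sketched.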
Response
Returns audio data in the specified format as a binary stream.
Get Support
Need help with text-to-speech?
- API Documentation: Developer Docs
- Discord Community: Join our Discord
- Email Support: support@fish.audio