Prerequisites
Create a Fish Audio account
Sign up for a free Fish Audio account to get started with our API.
- Go to fish.audio/auth/signup
- Fill in your details to create an account, then complete the verification steps
- Log in to your account and navigate to the API section
Get your API key
Once you have an account, you’ll need an API key to authenticate your requests.
- Log in to your Fish Audio Dashboard
- Navigate to the API Keys section
- Click “Create New Key”, give it a descriptive name, and set an expiration if desired
- Copy your key and store it securely
Keep your API key secret! Never commit it to version control or share it publicly.
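One common way to keep the key out of source control is to read it from an environment variable at runtime. The sketch below assumes a variable named FISH_API_KEY, which is an illustrative name and not one this page prescribes.

```python
import os


def load_api_key(env=None, var_name="FISH_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    FISH_API_KEY is an assumed variable name chosen for this example.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before running; do not hard-code the key.")
    return key
```

Failing early with a clear message avoids sending unauthenticated requests when the variable is not set.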
Get free API credits by verifying your phone number.
Overview
Voice cloning allows you to generate speech that matches a specific voice using reference audio. Fish Audio supports two approaches:
- Using pre-trained voice models (reference_id)
- Providing reference audio directly in your request
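The two approaches differ mainly in which field the request carries. A minimal sketch of the two payload shapes follows. The field names reference_id and references come from this page; the helper name, the audio key, and the placeholder model ID are assumptions made for illustration, and requiring exactly one of the two fields is a simplification of this sketch, not a documented rule.

```python
def build_tts_payload(text, reference_id=None, references=None):
    """Build a request payload that uses exactly one cloning approach."""
    if (reference_id is None) == (references is None):
        raise ValueError("Pass either reference_id or references, not both or neither.")
    payload = {"text": text}
    if reference_id is not None:
        payload["reference_id"] = reference_id    # pre-trained voice model
    else:
        payload["references"] = list(references)  # inline reference audio
    return payload


# Approach 1: reuse a stored voice model (placeholder ID).
by_model = build_tts_payload("Hello!", reference_id="your-model-id")

# Approach 2: send reference audio with its transcript in the request.
inline = build_tts_payload(
    "Hello!",
    references=[{"audio": b"<wav bytes>", "text": "Transcript of the sample."}],
)
```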
Use reference_id when you’ll reuse a voice multiple times; it’s faster and more efficient. Use references for one-off voice cloning or for testing different voices without creating models.
Using Reference Audio
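As a rough sketch of the inline approach, the helpers below read one audio sample from disk and pair it with its transcript. The text field mirrors the text parameter of ReferenceAudio mentioned on this page; the audio key, the function names, and the file name in the usage comment are assumptions for illustration, and actually sending the request is left out.

```python
from pathlib import Path


def load_reference(audio_path, transcript):
    """Pair raw audio bytes with the exact words spoken in that audio."""
    return {"audio": Path(audio_path).read_bytes(), "text": transcript}


def build_inline_request(text, audio_path, transcript):
    """Build a one-off cloning request carrying a single reference sample."""
    return {"text": text, "references": [load_reference(audio_path, transcript)]}


# Example with a hypothetical file name:
# request = build_inline_request("Text to speak.", "sample.wav", "Words spoken in sample.wav.")
```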
Clone a voice by providing reference audio directly in your request.
Multiple References
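A minimal sketch of a request carrying several samples, assuming the same payload shape as a single-reference request: a references list whose entries hold audio bytes and a text transcript. The audio key and the helper name are illustrative assumptions.

```python
def build_multi_reference_request(text, samples):
    """Build a request from (audio_bytes, transcript) pairs.

    Each sample keeps its own transcript so audio and text stay aligned.
    """
    references = [{"audio": audio, "text": transcript} for audio, transcript in samples]
    if not references:
        raise ValueError("Provide at least one reference sample.")
    return {"text": text, "references": references}


request = build_multi_reference_request(
    "Text to speak in the cloned voice.",
    [
        (b"<sample 1 bytes>", "Transcript of the first sample."),
        (b"<sample 2 bytes>", "Transcript of the second sample."),
    ],
)
```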
Improve voice quality by providing multiple reference samples.
Creating Voice Models
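The workflow can be sketched as two steps: upload samples once to obtain a model ID, then pass that ID as reference_id in later requests. In the sketch below, upload is a hypothetical callable standing in for the real API call, and the title, voices, and texts field names are assumptions; check the API reference for the actual model-creation parameters.

```python
def create_voice_model(upload, title, voices, texts):
    """Create a persistent voice model and return its ID.

    `upload` is a hypothetical stand-in for the real API call: it receives
    the model fields and returns the new model's ID.
    """
    if len(voices) != len(texts):
        raise ValueError("Each voice sample needs a matching transcript.")
    return upload({"title": title, "voices": list(voices), "texts": list(texts)})


def tts_payload_for_model(text, model_id):
    """Later requests reference the stored model instead of resending audio."""
    return {"text": text, "reference_id": model_id}


# Stand-in uploader for demonstration; a real one would call the Fish Audio API.
model_id = create_voice_model(
    lambda fields: "demo-model-id", "My voice", [b"<bytes>"], ["Transcript."]
)
payload = tts_payload_for_model("Hello again.", model_id)
```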
For repeated use, create a persistent voice model.
Best Practices
Audio Quality
For best results, reference audio should:
- Be 10-30 seconds long per sample
- Have clear speech without background noise
- Match the language you’ll generate
- Include varied intonation and emotion
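Of these guidelines, only the length can be checked mechanically. A minimal sketch using Python's standard wave module follows; it handles WAV files only, and the function names are illustrative.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_good_length(path, min_s=10.0, max_s=30.0):
    """Check the 10-30 second guideline for a single reference sample."""
    return min_s <= wav_duration_seconds(path) <= max_s
```

Background noise, language match, and intonation still need a listening check.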
Sample Text
The text parameter in ReferenceAudio should:
- Match exactly what’s spoken in the audio
- Include punctuation for proper prosody
- Be in the same language as generation
Performance Tips
- Pre-upload models for frequently used voices
- Use 2-3 reference samples for optimal quality
- Keep samples under 30 seconds each
- Normalize audio levels before uploading
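This page does not say how to normalize levels. One simple option is peak normalization, sketched below for 16-bit PCM samples; the 0.9 target is an assumed value that leaves some headroom, not a Fish Audio recommendation.

```python
from array import array


def peak_normalize(samples, target_peak=0.9):
    """Scale 16-bit PCM samples so the loudest one sits at target_peak of full scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return array("h", samples)  # silence: nothing to scale
    scale = target_peak * 32767 / peak
    # Clamp to the signed 16-bit range after rounding.
    return array("h", (max(-32768, min(32767, round(s * scale))) for s in samples))
```

Peak normalization only matches the loudest sample across files; loudness-based normalization may give more consistent results across speakers.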
Audio Format Requirements
Supported formats for reference audio:
- WAV (recommended)
- MP3
- M4A
- Other common audio formats
Audio specifications:
- Sample rate: 16kHz minimum, 44.1kHz recommended
- Channels: mono or stereo (stereo is converted to mono)
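For WAV files, the sample rate and channel count can be checked locally before uploading. A minimal sketch with the standard wave module follows; the function name and the keys of the returned dictionary are illustrative.

```python
import wave


def inspect_wav(path, minimum_hz=16000, recommended_hz=44100):
    """Report sample rate and channel count against the stated requirements."""
    with wave.open(path, "rb") as wav:
        rate, channels = wav.getframerate(), wav.getnchannels()
    return {
        "sample_rate": rate,
        "channels": channels,
        "meets_minimum": rate >= minimum_hz,
        "meets_recommended": rate >= recommended_hz,
        "will_be_downmixed": channels > 1,  # stereo is converted to mono
    }
```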