Overview
Fish Audio models support 64+ emotional expressions and voice styles that can be controlled through text markers in your input. Add natural pauses, laughter, and other human-like elements to make speech more engaging and realistic.
How It Works
Wrap each emotion tag in parentheses within your text, for example (excited) placed before “We did it!”.
Complete Emotion Reference
Basic Emotions (24 expressions)
Advanced Emotions (25 expressions)
Tone Markers (5 expressions)
Control volume and intensity.
Audio Effects (10 expressions)
Add natural human sounds.
Special Effects
Additional markers for atmosphere and context.
Usage Guidelines
Placement Rules
For English and Most Languages:
- Emotion tags MUST go at the beginning of sentences
- Tone controls can go anywhere in the text
- Sound effects can go anywhere in the text
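The placement rule for emotion tags can be checked mechanically before text is sent for synthesis. The following is a minimal sketch, not part of any Fish Audio SDK: the function name misplaced_emotions is hypothetical, sentence splitting is a naive split on ".", "!" and "?", and the emotion set is limited to the tags that appear in the Emotion Intensity Scale table of this guide.

```python
import re

# Assumption: only the emotion tags from the Emotion Intensity Scale table
# are listed here; extend this set with the full emotion reference as needed.
EMOTION_TAGS = {
    "satisfied", "happy", "delighted",
    "disappointed", "sad", "depressed",
    "frustrated", "angry", "furious",
    "nervous", "scared", "terrified",
    "interested", "excited", "ecstatic",
}

TAG_RE = re.compile(r"\(([a-z ]+)\)")
# Naive sentence splitter: break on whitespace that follows ., ! or ?
SENTENCE_SPLIT_RE = re.compile(r"(?<=[.!?])\s+")


def misplaced_emotions(text, emotions=EMOTION_TAGS):
    """Return emotion tags that appear after spoken words in a sentence.

    Tags outside `emotions` (tone controls, sound effects) are ignored,
    since they may go anywhere in the text.
    """
    problems = []
    for sentence in SENTENCE_SPLIT_RE.split(text.strip()):
        seen_words = False
        pos = 0
        for match in TAG_RE.finditer(sentence):
            # Any non-blank text before this tag means we are past the start.
            if sentence[pos:match.start()].strip():
                seen_words = True
            if seen_words and match.group(1) in emotions:
                problems.append(match.group(1))
            pos = match.end()
    return problems


print(misplaced_emotions("(happy) What a day! I am (sad) now."))
```

Stacked tags at the start of a sentence, such as (sad)(sighing), pass the check because no spoken text separates them from the sentence start.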
Combining Effects
You can layer multiple emotions for complex expressions, for example (angry)(shouting).
Sequential Emotions
Change emotions throughout your text.
Advanced Techniques
Emotion Transitions
Create natural emotional progressions.
Background Effects
Add atmospheric sounds.
Intensity Modifiers
Fine-tune emotional intensity with descriptive modifiers.
Model Capabilities
Feature | Fish Speech 1.5 | OpenAudio Dev | OpenAudio Pro |
---|---|---|---|
Basic Emotions | 24 | 24 | 24 |
Advanced Emotions | Limited | 25 | 25 |
Tone Markers | 5 | 5 | 5 |
Audio Effects | 6 | 10 | 10 |
Intensity Modifiers | No | Yes | Yes |
Background Effects | No | Yes | Yes |
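If you select a model programmatically, the table above can be mirrored as a small lookup. This is a sketch under stated assumptions: the dictionary keys and the support_level helper are hypothetical names, and treating a lower count (6 of 10 audio effects) as "partial" is an interpretation of the table, not something the table states.

```python
# Values copied from the Model Capabilities table; key names are illustrative.
CAPABILITIES = {
    "Fish Speech 1.5": {
        "basic_emotions": 24, "advanced_emotions": "limited",
        "tone_markers": 5, "audio_effects": 6,
        "intensity_modifiers": False, "background_effects": False,
    },
    "OpenAudio Dev": {
        "basic_emotions": 24, "advanced_emotions": 25,
        "tone_markers": 5, "audio_effects": 10,
        "intensity_modifiers": True, "background_effects": True,
    },
    "OpenAudio Pro": {
        "basic_emotions": 24, "advanced_emotions": 25,
        "tone_markers": 5, "audio_effects": 10,
        "intensity_modifiers": True, "background_effects": True,
    },
}


def support_level(model, feature):
    """Classify a table cell as 'none', 'partial', or 'full'."""
    value = CAPABILITIES[model][feature]
    if value is False:
        return "none"
    if value is True:
        return "full"
    if value == "limited":
        return "partial"
    # Numeric cell: 'full' only if it matches the largest count in the column.
    counts = [caps[feature] for caps in CAPABILITIES.values()
              if isinstance(caps[feature], int)]
    return "full" if value == max(counts) else "partial"


print(support_level("Fish Speech 1.5", "audio_effects"))
```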
Language Support
All 30+ supported languages can use emotion markers:
- English, Spanish, French, German: Emotions must be at sentence start
- Chinese, Japanese, Korean: More flexible placement allowed
- Arabic, Hebrew: Right-to-left text considerations apply
Best Practices
Do’s
- Use one primary emotion per sentence
- Test different emotion combinations
- Match emotions to context logically
- Add appropriate text after sound effects (e.g., “Ha ha” after laughing)
- Use natural expressions when possible
- Space out emotional changes for realism
Don’ts
- Don’t overuse emotion tags in short text
- Don’t mix conflicting emotions
- Don’t create custom tags - use only supported ones
- Don’t forget parentheses
- Don’t place emotion tags mid-sentence in English
Common Use Cases
Customer Service
Storytelling
Educational Content
Marketing & Sales
Troubleshooting
Emotion Not Working?
- Check placement - Emotions must be at the beginning of sentences for English
- Verify spelling - Tags must match exactly as listed
- Include parentheses - Tags must be wrapped in parentheses
- Confirm model support - Check the model capabilities table
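The spelling check in this list is easy to automate. The sketch below uses Python's standard difflib module to suggest the closest known tag for each unrecognized one. The function name suggest_fixes is hypothetical, and the tag list contains only tags that appear in the tables of this guide, so treat it as a placeholder for the complete reference.

```python
import difflib
import re

# Assumption: a partial tag list collected from this guide's tables.
KNOWN_TAGS = [
    "satisfied", "happy", "delighted", "disappointed", "sad", "depressed",
    "frustrated", "angry", "furious", "nervous", "scared", "terrified",
    "interested", "excited", "ecstatic", "mysterious", "uncertain",
    "whispering", "shouting", "sighing", "laughing",
]

TAG_RE = re.compile(r"\(([A-Za-z ]+)\)")


def suggest_fixes(text, known=KNOWN_TAGS):
    """Map each unrecognized tag to its closest known tag, or None.

    Note: any parenthesized words are treated as a tag, so an ordinary
    aside such as "(see below)" is reported as unrecognized too.
    """
    fixes = {}
    for tag in TAG_RE.findall(text):
        if tag in known:
            continue
        close = difflib.get_close_matches(tag.lower(), known, n=1)
        fixes[tag] = close[0] if close else None
    return fixes


print(suggest_fixes("(exited) We did it! (laughing) Ha ha!"))
```

A capitalized tag such as (Angry) is reported with the lowercase form as its suggestion, since tags must match exactly as listed.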
Unnatural Sound?
- Space out emotional changes
- Use appropriate intensity
- Test with different voices
- Add context text after sound effects
Performance Notes
- Emotion markers don’t count toward token limits
- No additional latency for emotion processing
- All emotions available on all pricing tiers
- Maximum of 3 combined emotions per sentence recommended
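The recommended limit of 3 combined emotions per sentence can be checked with a simple counter. This sketch is illustrative only: the function name sentences_over_limit is hypothetical, sentence splitting is naive, and it counts every parenthesized tag in a sentence, which overcounts if tone markers or sound effects are not meant to be included in the limit.

```python
import re

TAG_RE = re.compile(r"\([a-z ]+\)")
# Naive sentence splitter: break on whitespace that follows ., ! or ?
SENTENCE_SPLIT_RE = re.compile(r"(?<=[.!?])\s+")


def sentences_over_limit(text, limit=3):
    """Return the sentences that contain more than `limit` tags."""
    return [
        sentence
        for sentence in SENTENCE_SPLIT_RE.split(text.strip())
        if len(TAG_RE.findall(sentence)) > limit
    ]


sample = (
    "(sad)(sighing) I wish things were different. "
    "(nervous)(uncertain)(scared)(whispering) Are you sure?"
)
print(sentences_over_limit(sample))
```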
Quick Reference Tables
Emotion Intensity Scale
Base Emotion | Mild | Moderate | Intense |
---|---|---|---|
Happy | satisfied | happy | delighted |
Sad | disappointed | sad | depressed |
Angry | frustrated | angry | furious |
Scared | nervous | scared | terrified |
Excited | interested | excited | ecstatic |
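The scale above maps naturally onto a lookup keyed by base emotion and intensity level. The sketch below encodes the table as written; the names INTENSITY_SCALE, LEVELS, and tag_for are hypothetical helpers for your own code, not part of any Fish Audio API.

```python
# Rows copied from the Emotion Intensity Scale table: (mild, moderate, intense).
INTENSITY_SCALE = {
    "happy": ("satisfied", "happy", "delighted"),
    "sad": ("disappointed", "sad", "depressed"),
    "angry": ("frustrated", "angry", "furious"),
    "scared": ("nervous", "scared", "terrified"),
    "excited": ("interested", "excited", "ecstatic"),
}
LEVELS = ("mild", "moderate", "intense")


def tag_for(base, level):
    """Return the parenthesized tag for a base emotion at a given level."""
    return f"({INTENSITY_SCALE[base][LEVELS.index(level)]})"


print(tag_for("angry", "intense"))  # (furious)
```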
Common Combinations
Scenario | Emotion Combo | Example |
---|---|---|
Whispered Secret | (mysterious)(whispering) | “I have something to tell you…” |
Angry Shout | (angry)(shouting) | “Stop right there!” |
Sad Sigh | (sad)(sighing) | “I wish things were different. Sigh.” |
Excited Laugh | (excited)(laughing) | “We did it! Ha ha!” |
Nervous Question | (nervous)(uncertain) | “Are you sure about this?” |
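When tagged text is generated in code, a small helper keeps the parentheses and ordering consistent with the combinations above. This is a minimal sketch: with_tags is a hypothetical function name, and the single space between the last tag and the spoken text is an assumption, since the table does not show how the tags and the example text are joined.

```python
def with_tags(text, *tags):
    """Prefix `text` with each tag wrapped in parentheses, in order."""
    prefix = "".join(f"({tag})" for tag in tags)
    # Assumption: one space separates the tag block from the spoken text.
    return f"{prefix} {text}" if prefix else text


print(with_tags("Stop right there!", "angry", "shouting"))
# (angry)(shouting) Stop right there!
```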
See Also
- Emotion Reference Guide - Complete emotion list with examples
- API Reference - Implementation details
- Text-to-Speech Best Practices