Overview
Real-time streaming lets you generate speech as you type or speak, perfect for chatbots, virtual assistants, and live applications.
When to Use Streaming
Perfect for:
- Live chat applications
- Virtual assistants
- Interactive storytelling
- Real-time translations
- Gaming dialogue
Not ideal for:
- Pre-recorded content
- Batch processing
- When perfect quality is critical
Getting Started
Web Playground
Try real-time streaming instantly:
- Visit fish.audio
- Enable “Streaming Mode”
- Start typing and hear the voice generated in real time
Using the SDK
Stream text as it’s being written.
Configuration Options
Speed vs Quality
Latency Modes:
- Normal: Best quality, ~500ms latency
- Balanced: Good quality, ~300ms latency
Voice Control
Temperature (0.1 - 1.0):
- Lower: More consistent, predictable
- Higher: More varied, expressive
Top P:
- Lower: More focused
- Higher: More diverse
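The latency and temperature options above can be collected in one validated settings object before a stream is opened. This is a minimal sketch: `StreamingConfig`, its field names, and the default temperature are hypothetical and are not taken from the SDK; only the two latency modes and the 0.1 to 1.0 temperature range come from this page.

```python
from dataclasses import dataclass

# Latency modes as described on this page.
LATENCY_MODES = ("normal", "balanced")


@dataclass(frozen=True)
class StreamingConfig:
    """Hypothetical settings container; not part of any SDK."""

    latency: str = "normal"      # "normal" favors quality, "balanced" favors speed
    temperature: float = 0.7     # arbitrary default chosen for this sketch

    def __post_init__(self) -> None:
        # Reject values outside the documented options early,
        # before any network connection is opened.
        if self.latency not in LATENCY_MODES:
            raise ValueError(f"latency must be one of {LATENCY_MODES}")
        if not 0.1 <= self.temperature <= 1.0:
            raise ValueError("temperature must be between 0.1 and 1.0")
```

Validating up front means a typo in a mode name fails immediately rather than partway through a live stream.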
Real-time Applications
Chatbot Integration
Stream responses as they’re generated.
Live Translation
Translate and speak simultaneously.
Best Practices
Text Buffering
Do:
- Send complete words with spaces
- Use punctuation for natural pauses
- Buffer 5-10 words for smoothness
Don't:
- Send individual characters
- Forget spaces between words
- Send huge chunks at once
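One way to follow these buffering rules is to regroup incoming text fragments (for example, tokens from a language model) into chunks of whole words before sending them. The sketch below is illustrative only: `buffer_words` and its `max_words` parameter are hypothetical names, and the choice to flush at sentence-ending punctuation is an assumption rather than documented behavior.

```python
import re
from typing import Iterable, Iterator, List

SENTENCE_END = (".", "!", "?")


def buffer_words(fragments: Iterable[str], max_words: int = 8) -> Iterator[str]:
    """Regroup arbitrary text fragments into chunks of complete words.

    A chunk is emitted when a word ends a sentence or when max_words
    complete words have accumulated. Emitted chunks keep a trailing space
    so that consecutive chunks do not fuse into one word.
    """
    pending = ""            # trailing text that may still be a partial word
    words: List[str] = []   # complete words waiting to be sent
    for fragment in fragments:
        pending += fragment
        # Everything before the last whitespace is made of complete words;
        # the final piece may be cut mid-word, so it stays pending.
        *complete, pending = re.split(r"\s+", pending)
        for word in complete:
            if not word:
                continue
            words.append(word)
            if word.endswith(SENTENCE_END) or len(words) >= max_words:
                yield " ".join(words) + " "
                words = []
    # The input is finished, so whatever is left is a complete word.
    if pending:
        words.append(pending)
    if words:
        yield " ".join(words)
```

For example, the fragments `"Hel"`, `"lo wor"`, `"ld. How"`, `" are"`, `" you"` come out as the two chunks `"Hello world. "` and `"How are you"`.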
Connection Management
- Keep connections alive for multiple generations
- Handle disconnections gracefully
- Implement retry logic for reliability
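Retry logic is often implemented as exponential backoff around whatever call opens or uses the connection. The following is a generic sketch, not SDK code: `with_retries` and its parameters are hypothetical, and it assumes the failing operation raises Python's built-in `ConnectionError` or `TimeoutError`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on connection errors with exponential backoff.

    Delays grow as base_delay, 2 * base_delay, 4 * base_delay, and so on.
    The error from the final attempt is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
    # Only reachable when attempts is zero or negative.
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.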
Audio Playback
For smooth playback:
- Buffer 2-3 audio chunks
- Use cross-fading between chunks
- Handle network delays gracefully
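The two playback techniques above can be sketched in a few lines. The names `prebuffer` and `crossfade` are hypothetical, and the cross-fade assumes chunks have already been decoded to lists of float samples; it would not apply directly to compressed audio bytes.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List


def prebuffer(chunks: Iterable[bytes], min_chunks: int = 3) -> Iterator[bytes]:
    """Hold back playback until min_chunks have arrived, then pass chunks on.

    After playback starts, min_chunks - 1 chunks stay queued as a cushion
    against late arrivals. Order is preserved.
    """
    queue: Deque[bytes] = deque()
    started = False
    for chunk in chunks:
        queue.append(chunk)
        if not started and len(queue) < min_chunks:
            continue  # still filling the initial cushion
        started = True
        yield queue.popleft()
    # The source is finished: drain whatever is still queued.
    while queue:
        yield queue.popleft()


def crossfade(previous: List[float], following: List[float], overlap: int) -> List[float]:
    """Join two sample lists, linearly blending overlap samples at the seam.

    The result is shorter than the plain concatenation by overlap samples.
    """
    overlap = min(overlap, len(previous), len(following))
    if overlap == 0:
        return previous + following
    start = len(previous) - overlap
    mixed = []
    for i in range(overlap):
        weight = (i + 1) / (overlap + 1)  # rises toward 1 across the seam
        mixed.append(previous[start + i] * (1 - weight) + following[i] * weight)
    return previous[:start] + mixed + following[overlap:]
```

A larger `min_chunks` trades a longer initial delay for more tolerance to network jitter.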
Common Use Cases
Interactive Story
Virtual Assistant
Live Commentary
Troubleshooting
Audio Gaps
Problem: Gaps between audio chunks
Solution:
- Increase buffer size
- Use balanced latency mode
- Check network connection
Delayed Response
Problem: Long wait before audio starts
Solution:
- Use balanced latency mode
- Send initial text immediately
- Reduce chunk size
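When diagnosing a delayed start, it helps to measure how long the first audio chunk actually takes to arrive. The wrapper below is a hypothetical helper (`report_first_chunk_delay` is not an SDK function) that works with any iterable of audio chunks.

```python
import time
from typing import Callable, Iterable, Iterator


def report_first_chunk_delay(
    chunks: Iterable[bytes],
    on_first: Callable[[float], None],
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[bytes]:
    """Pass chunks through unchanged and report the delay before the first one.

    The timer starts when the consumer first pulls from this wrapper,
    and on_first receives the elapsed seconds exactly once.
    """
    start = clock()
    first = True
    for chunk in chunks:
        if first:
            on_first(clock() - start)
            first = False
        yield chunk
```

Comparing this number across latency modes and chunk sizes shows which of the settings above makes a difference in a given deployment.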
Choppy Playback
Problem: Audio cuts in and out
Solution:
- Buffer more chunks before playing
- Check network stability
- Use consistent chunk sizes
Advanced Features
Dynamic Voice Switching
Change voices mid-stream.
Emotion Injection
Add emotions dynamically.
Speed Control
Adjust speaking speed.
Performance Tips
- Pre-load voices for instant start
- Use connection pooling for multiple streams
- Monitor latency and adjust settings
- Cache common phrases for instant playback
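Caching common phrases can be as simple as a small least-recently-used map in front of the synthesis call. In this sketch, `PhraseCache` is a hypothetical class and `synthesize` stands for any function that turns text into audio bytes; neither name comes from the SDK.

```python
from collections import OrderedDict
from typing import Callable


class PhraseCache:
    """Least-recently-used cache of generated audio, keyed by phrase text."""

    def __init__(self, synthesize: Callable[[str], bytes], max_entries: int = 128) -> None:
        self._synthesize = synthesize   # assumed: text in, audio bytes out
        self._max_entries = max_entries
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, text: str) -> bytes:
        # Normalize whitespace so "Hello  there" and "Hello there" share an entry.
        key = " ".join(text.split())
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as recently used
            return self._entries[key]
        audio = self._synthesize(key)
        self._entries[key] = audio
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used
        return audio
```

A cache like this only helps when the voice and settings are fixed; if they vary, they would need to be part of the key.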
Get Support
Need help with streaming?
- Discord Community: Join our Discord
- Email Support: support@fish.audio
- Status Page: status.fish.audio