Voice Cloning Best Practices
Tips for preparing audio samples that give the best voice cloning results.
Audio Quality Guidelines
- Single speaker only
- Steady volume, tone, and emotion
- Brief pauses (0.5s recommended)
- Ideally: No background noise
- Ideally: Professional recording quality
- Ideally: No room echo
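As a rough illustration of the "steady volume" guideline, the sketch below measures how much loudness varies across a recording. It is not part of any official tooling: the function names, the 0.5 s window, and the silence floor of 100 (on a 16-bit PCM scale) are all assumptions chosen for the example, and the samples are expected as a plain list of integers that has already been decoded from the audio file.

```python
import math


def window_rms(samples, sample_rate, window_s=0.5):
    """Return the RMS level of each consecutive, non-overlapping window."""
    size = int(sample_rate * window_s)
    levels = []
    for start in range(0, len(samples) - size + 1, size):
        window = samples[start:start + size]
        levels.append(math.sqrt(sum(s * s for s in window) / size))
    return levels


def volume_spread_db(levels, silence_floor=100.0):
    """Ratio in dB between the loudest and quietest non-silent windows.

    Windows at or below silence_floor (an assumed threshold) are treated
    as pauses and ignored, so brief silences do not count as volume swings.
    """
    voiced = [lv for lv in levels if lv > silence_floor]
    if len(voiced) < 2:
        return 0.0
    return 20 * math.log10(max(voiced) / min(voiced))


if __name__ == "__main__":
    # Synthetic example: one second at amplitude 1000, then one at 2000.
    rate = 8000
    samples = [1000] * rate + [2000] * rate
    spread = volume_spread_db(window_rms(samples, rate))
    print(f"Volume spread: {spread:.1f} dB")
```

A small spread suggests an even delivery. What counts as "small enough" depends on the recording, so treat any cutoff as a starting point rather than a rule.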
Instant Voice Cloning (Playground)
- 30-45 seconds of quality audio
- Best: 2-3 clips of 15-20 seconds each that together form a complete paragraph
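The duration guidance above can be checked before uploading. The following is a minimal sketch, assuming you already know each clip's length in seconds; the function name and messages are hypothetical and only encode the numbers listed in this section (2-3 clips, 15-20 seconds each, 30-45 seconds in total).

```python
def check_instant_clone_clips(durations_s):
    """Return a list of problems found; an empty list means the clips fit."""
    problems = []

    if not 2 <= len(durations_s) <= 3:
        problems.append(f"expected 2-3 clips, got {len(durations_s)}")

    for index, duration in enumerate(durations_s, start=1):
        if not 15 <= duration <= 20:
            problems.append(f"clip {index} is {duration}s, expected 15-20s")

    total = sum(durations_s)
    if not 30 <= total <= 45:
        problems.append(f"total is {total}s, expected 30-45s")

    return problems


if __name__ == "__main__":
    for clips in ([15, 15, 15], [20, 20, 20], [10]):
        print(clips, check_instant_clone_clips(clips) or "ok")
```

Note that three clips at the 20-second upper bound add up to 60 seconds, which exceeds the 45-second total, so with three clips each one should stay close to 15 seconds.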
Premium Voice Cloning (Let’s Talk)
- 30-180 minutes of high-quality audio
- Optional: Multiple languages and emotions
File Formats
- Various audio types accepted
- Recommended: MP3 at 192 kbps or higher to avoid quality loss
- Uncompressed formats (e.g., WAV) offer minimal benefit
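To put the format recommendation in perspective, the arithmetic below estimates upload sizes for a constant-bitrate MP3 and for uncompressed WAV. This is only a back-of-the-envelope sketch: the WAV parameters (44.1 kHz, 16-bit, mono) are assumptions for illustration, file headers and metadata are ignored, and the function names are hypothetical.

```python
def mp3_size_bytes(duration_s, bitrate_kbps=192):
    """Approximate size of a constant-bitrate MP3 (audio data only)."""
    return duration_s * bitrate_kbps * 1000 // 8


def wav_size_bytes(duration_s, sample_rate=44100, bit_depth=16, channels=1):
    """Approximate size of uncompressed PCM WAV (audio data only)."""
    return duration_s * sample_rate * (bit_depth // 8) * channels


if __name__ == "__main__":
    # 45 seconds (upper bound for instant cloning) and
    # 180 minutes (upper bound for premium cloning).
    for label, seconds in (("45 s", 45), ("180 min", 180 * 60)):
        mp3_mb = mp3_size_bytes(seconds) / 1_000_000
        wav_mb = wav_size_bytes(seconds) / 1_000_000
        print(f"{label}: MP3 ~{mp3_mb:.1f} MB, WAV ~{wav_mb:.1f} MB")
```

Under these assumptions the WAV files come out several times larger than the MP3s, which is the cost side of choosing an uncompressed format.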