Audio Quality Guidelines

  • Single speaker only
  • Steady volume, tone, and emotion
  • Brief pauses (about 0.5 seconds each recommended)
  • Ideally: No background noise
  • Ideally: Professional recording quality
  • Ideally: No room echo

Instant Voice Cloning (Playground)

  • 30-45 seconds of quality audio
  • Best: two or three clips of 15-20 seconds each that together form a complete paragraph
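
The length guidance above can be checked before uploading. The following is a minimal sketch, not part of any official tooling: the function name `check_instant_clone_clips` and the constants are hypothetical, and clip durations are assumed to have been measured already (in seconds) by whatever audio tool you use.

```python
from typing import List

# Ranges taken from the guidelines above; the names are illustrative only.
TOTAL_RANGE_S = (30.0, 45.0)   # total audio: 30-45 seconds
CLIP_RANGE_S = (15.0, 20.0)    # each clip: 15-20 seconds
CLIP_COUNT_RANGE = (2, 3)      # best results: 2-3 clips


def check_instant_clone_clips(durations_s: List[float]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the
    clips match the recommended lengths."""
    problems: List[str] = []

    total = sum(durations_s)
    if not (TOTAL_RANGE_S[0] <= total <= TOTAL_RANGE_S[1]):
        problems.append(f"total length {total:.1f}s is outside 30-45s")

    count = len(durations_s)
    if not (CLIP_COUNT_RANGE[0] <= count <= CLIP_COUNT_RANGE[1]):
        problems.append(f"{count} clip(s) supplied; 2-3 works best")

    for i, d in enumerate(durations_s, start=1):
        if not (CLIP_RANGE_S[0] <= d <= CLIP_RANGE_S[1]):
            problems.append(f"clip {i} is {d:.1f}s; 15-20s works best")

    return problems


if __name__ == "__main__":
    # Two clips of 18.0s and 17.5s fall inside every recommended range.
    print(check_instant_clone_clips([18.0, 17.5]))
```

The check only covers length. The audio quality points (single speaker, steady volume, no noise or echo) still need to be judged by ear.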

Premium Voice Cloning (Let’s Talk)

  • 30-180 minutes of high-quality audio
  • Optional: Multiple languages and emotions

File Formats

  • Various audio types accepted
  • Recommended: MP3 at 192 kbps or higher to avoid quality loss
  • Uncompressed formats (e.g., WAV) offer minimal benefit over high-bitrate MP3
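
As an illustration of the format guidance above, here is a small sketch of a pre-upload check. The function `check_upload_format` is hypothetical, and the bitrate is assumed to be known already (for example, read from your audio editor's export settings); the sketch does not inspect the file itself.

```python
import os
from typing import Optional

MIN_MP3_KBPS = 192  # recommended minimum MP3 bitrate from the guidelines


def check_upload_format(filename: str, bitrate_kbps: Optional[int] = None) -> str:
    """Return a short verdict for a file, based only on its extension and
    a bitrate supplied by the caller."""
    ext = os.path.splitext(filename)[1].lower()

    if ext == ".mp3":
        if bitrate_kbps is None:
            return "bitrate unknown: 192 kbps or higher is recommended"
        if bitrate_kbps >= MIN_MP3_KBPS:
            return "ok"
        return f"below {MIN_MP3_KBPS} kbps: quality loss is likely"

    if ext == ".wav":
        # Uncompressed audio is fine, it just adds little over good MP3.
        return "ok (minimal benefit over high-bitrate MP3)"

    # Other types may well be accepted; this sketch simply does not cover them.
    return "not covered by this sketch: check the accepted audio types"


if __name__ == "__main__":
    print(check_upload_format("sample.mp3", 192))
    print(check_upload_format("sample.mp3", 128))
```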