Generate expressive, natural speech with Resemble AI's Chatterbox Pro.
Model Overview
Chatterbox Pro is Resemble AI's first production-grade open-source Text-to-Speech (TTS) model. It is designed to generate expressive, natural-sounding speech and is reported to outperform leading closed-source systems in evaluations.
Best At
Generates high-quality, natural-sounding speech for applications such as memes, videos, games, and AI agents. Its emotion exaggeration control makes voices more distinct and engaging.
Limitations / Not Good At
The model is generally stable, but extreme values for 'exaggeration' can make the output unstable. All output carries a built-in watermark for responsible AI, which may matter for some use cases.
Ideal Use Cases
- Voiceovers for videos and games 🎬
- Creating dialogue for AI agents and chatbots 🤖
- Generating speech for memes and social media content 😂
- Prototyping TTS for applications
Input & Output Format
Input: Text prompt, optional voice selection, custom voice UUID, exaggeration level, temperature, seed, and pitch.
Output: Audio file (URI).
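As a rough illustration of the inputs listed above, the sketch below assembles a request body in Python. The field names (`text`, `voice`, `custom_voice_uuid`, `exaggeration`, `temperature`, `seed`, `pitch`) and the helper `build_tts_request` are hypothetical, chosen only to mirror the input list; the actual parameter names, value ranges, and endpoint depend on the hosting API and are not specified here.

```python
from typing import Any, Dict, Optional


def build_tts_request(
    text: str,
    voice: Optional[str] = None,
    custom_voice_uuid: Optional[str] = None,
    exaggeration: Optional[float] = None,
    temperature: Optional[float] = None,
    seed: Optional[int] = None,
    pitch: Optional[float] = None,
) -> Dict[str, Any]:
    """Assemble a request body for a TTS call.

    All field names are assumptions for illustration, not a documented schema.
    """
    if not text.strip():
        raise ValueError("text prompt must not be empty")

    # Optional controls; anything left as None is omitted so the
    # service can apply its own defaults.
    optional = {
        "voice": voice,
        "custom_voice_uuid": custom_voice_uuid,
        "exaggeration": exaggeration,
        "temperature": temperature,
        "seed": seed,
        "pitch": pitch,
    }
    payload: Dict[str, Any] = {"text": text}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


if __name__ == "__main__":
    import json

    # Fixing the seed is a common way to make repeated runs comparable.
    request_body = build_tts_request("Hello there!", exaggeration=0.5, seed=42)
    print(json.dumps(request_body, indent=2))
```

The helper only builds the payload and does not send it. Whether a preset voice and a custom voice UUID can be combined is not stated above, so the sketch does not enforce either choice.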
Performance Notes
Offers ultra-low latency (under 200 ms) for production use. Outputs are watermarked for responsible AI. Voice conversion scripts are also provided.