ElevenLabs TTS Turbo V2.5
High-speed text-to-speech audio generation using ElevenLabs' Turbo v2.5 model. Customizable voices and speech attributes.
Model Overview
A fast text-to-speech model that converts input text into natural-sounding speech with customizable voice settings, speed, and stability.
Best At
Fast, high-quality speech synthesis. Ideal for real-time applications and quick content creation when speed is critical. Allows fine-tuning of voice characteristics and speed.
Limitations / Not Good At
Speed, Style, Stability, and Similarity Boost need careful tuning, since extreme values can make the output sound unnatural or unstable. Long-form content may need to be split into segments to maintain coherence.
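One way to segment long-form content is to pack whole sentences into chunks below a size limit. The following is a minimal sketch, not part of the node itself: the function name `split_into_segments` and the 500-character default are illustrative assumptions, and the page does not state an actual length limit.

```python
import re


def split_into_segments(text: str, max_chars: int = 500) -> list[str]:
    """Greedily pack whole sentences into segments of at most max_chars.

    max_chars=500 is an arbitrary illustrative value, not a documented limit.
    A single sentence longer than max_chars is kept whole rather than cut.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    segments: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the segment.
            segments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments
```

Each returned segment can then be sent as the Text input of a separate generation.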
Ideal Use Cases
- Audiobooks or narration where speed is important.
- Real-time virtual assistants.
- Custom voiceovers for marketing videos.
- E-learning content.
Input & Output Format
Input: Text prompt (string). Output: Audio file (MP3) accessed via a URL, with optional word-level timestamps.
Performance Notes
Fast generation, suitable for real-time applications. Supports batch processing, with optional continuity parameters (Previous Text and Next Text) for joining multiple generations.
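When a batch of segments is generated separately and concatenated, each request can carry its neighbours as context through the Previous Text and Next Text inputs. The sketch below shows one way to prepare that; the helper `attach_context` and the dictionary keys `text`, `previous_text`, and `next_text` are hypothetical names chosen to mirror the input labels, not a documented schema.

```python
from typing import Optional, TypedDict


class SegmentRequest(TypedDict):
    # Hypothetical keys mirroring the Text, Previous Text, and Next Text inputs.
    text: str
    previous_text: Optional[str]
    next_text: Optional[str]


def attach_context(segments: list[str]) -> list[SegmentRequest]:
    """Pair each segment with the segment before and after it."""
    requests: list[SegmentRequest] = []
    for i, segment in enumerate(segments):
        requests.append(
            {
                "text": segment,
                # The first segment has nothing before it.
                "previous_text": segments[i - 1] if i > 0 else None,
                # The last segment has nothing after it.
                "next_text": segments[i + 1] if i + 1 < len(segments) else None,
            }
        )
    return requests
```

Both context fields are optional, so `None` here simply means the input would be left empty for that request.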
Text
String. The text to convert to speech.
Previous Text (Optional)
String. The text that came before the text of the current request. Can be used to improve the speech's continuity when concatenating together multiple generations or to influence the speech's continuity in the current generation.
Next Text (Optional)
String. The text that comes after the text of the current request. Can be used to improve the speech's continuity when concatenating together multiple generations or to influence the speech's continuity in the current generation.
Speed
Number. Speech speed (0.7-1.2). Values below 1.0 slow down the speech; values above 1.0 speed it up. Extreme values may affect quality. Default: 1.
Style
Number. Style exaggeration (0-1). Amplifies the distinctive speaking style of the original voice. It adds computational overhead and latency, and can make the output slightly less stable, so it’s best kept at 0 unless a dramatic effect is needed. Default: 0.
Stability
Number. Voice stability (0-1). Controls how consistent the voice is. Lower values give a wider emotional range and more varied pacing, but can sound erratic. Higher values produce a steadier, more monotone delivery that usually requires fewer iterations to hit the desired tone. Default: 0.5.
Similarity Boost
Number. Similarity boost (0-1). Controls how closely the AI adheres to the original voice when replicating it. If the original audio is of poor quality and the value is set too high, the AI may reproduce artifacts or background noise that were present in the original recording. Default: 0.75.
Voice
String. The voice to use for speech generation. Default: 21m00Tcm4TlvDq8ikWAM.
Language Code
String. Language code (ISO 639-1) used to enforce a language for the model. Currently only Turbo v2.5 and Flash v2.5 support language enforcement. For other models, an error is returned if a language code is provided.
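Because the numeric inputs have fixed ranges, it can help to validate them before submitting a generation. The sketch below is illustrative only: the class `VoiceSettings`, its field names, and the helper `looks_like_iso_639_1` are assumptions rather than part of the node, and the default values are taken from the values shown in the listing on this page.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical container for the numeric inputs, with range checks."""

    speed: float = 1.0
    style: float = 0.0
    stability: float = 0.5
    similarity_boost: float = 0.75

    def __post_init__(self) -> None:
        # Ranges as stated in the input descriptions.
        limits = {
            "speed": (0.7, 1.2),
            "style": (0.0, 1.0),
            "stability": (0.0, 1.0),
            "similarity_boost": (0.0, 1.0),
        }
        for name, (low, high) in limits.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside [{low}, {high}]")


def looks_like_iso_639_1(code: str) -> bool:
    """Check only the two-lowercase-letter shape of an ISO 639-1 code.

    This does not verify that the code is actually assigned to a language.
    """
    return re.fullmatch(r"[a-z]{2}", code) is not None
```

For example, `VoiceSettings(stability=0.3)` is accepted, while `VoiceSettings(speed=1.5)` raises a `ValueError` because 1.5 is above the stated maximum of 1.2.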
Voice Control
String
Advanced
String
Output
Inferred
Type
Node
Status
Official
Package
Nodespell AI
Category
AI / Audio / Elevenlabs