🔊Introducing Voxtral TTS: our new frontier open-weight model for natural, expressive, and ultra-fast text-to-speech
🎭Realistic, emotionally expressive speech.
🌍Supports 9 languages and accurately captures diverse dialects.
⚡Very low latency for time-to-first-audio.
🔄Easily adaptable to new voices.
VIDEO
Voxtral TTS is built for global applications: it supports 9 languages and powers voice workflows.
✅ Full audio intelligence: Works with Voxtral Transcribe for end-to-end speech-to-speech, or plugs into any STT + LLM stack.
✅ Built for business: From customer support to real-time translation, it’s the output layer that passes the human test.
🎥 See it in action:
VIDEO
State-of-the-art performance.
In zero-shot custom voice tests, Voxtral TTS outperformed ElevenLabs v2.5 Flash, as judged by native speakers on naturalness, accent accuracy, and similarity to the original voice.

Experiment with Voxtral TTS directly in the Mistral Studio playground. Select one of the Mistral voices or record your own.
VIDEO
Check out our blog post for details: mistral.ai/news/voxtral-t…
