Mistral AI Unveils Voxtral TTS: A Revolutionary Open-Source Voice Model

French AI company Mistral AI has unveiled Voxtral TTS, an open-source text-to-speech model positioned to rival offerings from industry heavyweights such as ElevenLabs and OpenAI. Launched on March 26, 2026, the compact 4-billion-parameter model supports nine languages: English, French, German, Spanish, Dutch, Portuguese, Italian, Hindi, and Arabic.

The model's key innovation lies in its efficiency and accessibility. Voxtral TTS can adapt to a custom voice using just 3 seconds of reference audio. Latency is low, with a time-to-first-audio of 90 milliseconds, and the model achieves a real-time factor of about 6x, meaning it can generate a 10-second audio clip in roughly 1.6 seconds.
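
The relationship between the quoted speed-up and generation time is simple division, and a short sketch makes the arithmetic explicit. The function names below are illustrative and not part of any Mistral tooling. The code treats "real-time factor" the way the article does, as audio duration divided by compute time; some speech literature defines it as the inverse.

```python
def generation_time_seconds(audio_seconds: float, real_time_factor: float) -> float:
    """Estimate compute time for a clip, given a speed-up over real time.

    Assumption: real_time_factor = audio duration / generation time,
    matching the article's usage (6x means six times faster than playback).
    """
    if real_time_factor <= 0:
        raise ValueError("real_time_factor must be positive")
    return audio_seconds / real_time_factor


def implied_real_time_factor(audio_seconds: float, generation_seconds: float) -> float:
    """Recover the speed-up implied by a measured generation time."""
    if generation_seconds <= 0:
        raise ValueError("generation_seconds must be positive")
    return audio_seconds / generation_seconds


if __name__ == "__main__":
    # A 10-second clip at exactly 6x.
    print(round(generation_time_seconds(10.0, 6.0), 2))
    # The speed-up implied by generating 10 seconds of audio in 1.6 seconds.
    print(round(implied_real_time_factor(10.0, 1.6), 2))
```

At exactly 6x, a 10-second clip takes about 1.67 seconds, while the quoted 1.6 seconds corresponds to a factor of 6.25, so the two figures agree once "6x" is read as a rounded value.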

Unlike its proprietary competitors, Mistral is releasing the full model weights under a Creative Commons license. This allows enterprises to run the model on their own hardware, from smartphones to data centers, without sending sensitive voice data to third parties. The move addresses growing concerns about data sovereignty, especially in Europe. Pierre Stock, Mistral’s VP of Science, emphasized that the model was designed to sound “super natural and conversational” rather than robotic.

The model is now available via API at $0.016 per 1,000 characters. Developers can access it through Mistral Studio or Le Chat, or download it from Hugging Face for local deployment.
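
To get a feel for what the quoted rate means in practice, here is a rough cost estimate in Python. It is a sketch under stated assumptions: the only figure taken from the article is $0.016 per 1,000 characters. The function name is hypothetical, and the code assumes that every character (including spaces and punctuation) is billed and that charges are pro-rated rather than rounded up, which the article does not specify.

```python
PRICE_PER_1000_CHARS_USD = 0.016  # API rate quoted in the article


def estimate_cost_usd(text: str, price_per_1000: float = PRICE_PER_1000_CHARS_USD) -> float:
    """Estimate the API cost of synthesizing `text`.

    Assumptions (not confirmed by the article): every character counts,
    including whitespace, and billing is pro-rated per character.
    """
    return len(text) / 1000 * price_per_1000


if __name__ == "__main__":
    sample = "Hello from Voxtral TTS. " * 100  # 2,400 characters of sample text
    print(f"{len(sample)} characters -> ${estimate_cost_usd(sample):.4f}")
```

Under these assumptions, one million characters would cost about $16.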

Source: TechCrunch
