Microsoft announces new text-to-speech AI model called VALL-E

VALL-E can synthesize audio of a person saying anything, in a way that attempts to preserve the speaker's emotional tone.

By Sahil Pawar

January 10, 2023

Microsoft announced a new text-to-speech AI model called VALL-E on Thursday that can closely simulate a person’s voice when given a three-second audio sample.

Once the model learns a specific voice, it can synthesize audio of that same person saying anything while attempting to preserve the speaker’s emotional tone. Its creators claim that VALL-E could be used for high-quality text-to-speech applications and, when combined with other generative AI models like GPT-3, for audio content creation.

Microsoft describes VALL-E as a neural codec language model that builds on an audio codec technology called EnCodec. Unlike other text-to-speech methods, which usually synthesize speech by manipulating waveforms, VALL-E uses EnCodec to generate discrete audio codec codes from text and acoustic prompts.

In essence, it analyzes how a person sounds, breaks that information into discrete components called tokens, and draws on its training data to predict how that voice would sound speaking phrases outside the three-second sample.
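To make the idea of "tokens" more concrete, here is a minimal Python sketch of how a snippet of audio can be mapped onto discrete codes by picking the nearest entry in a codebook. It is a toy illustration only: the four-entry CODEBOOK, the two-sample frame size, and the tokenize and detokenize helpers are all hypothetical, and the sketch is not EnCodec's or VALL-E's actual implementation, which relies on learned neural networks rather than a hand-written table.

```python
from typing import List, Sequence

Frame = Sequence[float]

# Hypothetical, hand-made codebook (an assumption for illustration).
# A real neural codec learns its codebook entries from data.
CODEBOOK: List[Frame] = [
    (0.0, 0.0),    # token 0: near silence
    (0.5, 0.5),    # token 1: steady positive
    (-0.5, -0.5),  # token 2: steady negative
    (0.5, -0.5),   # token 3: falling
]


def squared_distance(a: Frame, b: Frame) -> float:
    """Squared Euclidean distance between two equal-length frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def tokenize(samples: Sequence[float], frame_size: int = 2) -> List[int]:
    """Cut samples into fixed-size frames and replace each frame
    with the index of its closest codebook entry."""
    tokens = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        nearest = min(
            range(len(CODEBOOK)),
            key=lambda i: squared_distance(frame, CODEBOOK[i]),
        )
        tokens.append(nearest)
    return tokens


def detokenize(tokens: Sequence[int]) -> List[float]:
    """Rebuild an approximate waveform by looking each token up again."""
    return [value for t in tokens for value in CODEBOOK[t]]


audio = [0.05, -0.02, 0.45, 0.6, 0.4, -0.55, -0.6, -0.4]
tokens = tokenize(audio)
print(tokens)              # [0, 1, 3, 2]
print(detokenize(tokens))  # approximate reconstruction of the input
```

Once audio is a sequence of integers like this, a language model can be trained to predict the next code in much the same way a text model predicts the next word, which is the general approach the article describes.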

Microsoft trained VALL-E’s speech synthesis system on an audio library called LibriLight, which Meta assembled. It contains about 60,000 hours of English-language speech from over 7,000 speakers, mainly drawn from LibriVox public domain audiobooks. For VALL-E to produce a good result, the voice in the sample must closely match a voice in the training data.
