Sunday, July 21, 2024

Meta Releases the SpeechMatrix Dataset for Speech-to-Speech Translation

Meta has released the SpeechMatrix dataset, a large collection of parallel speech-to-speech segments mined from VoxPopuli across seventeen languages, which researchers can use to build their own speech-to-speech (S2S) translation systems.

Meta's S2S system for Hokkien was developed under its Universal Speech Translator (UST) project. Hokkien, one of Taiwan’s official languages, is widely spoken across the Chinese diaspora but has no standard written form. The company said its AI researchers developed translation tools for the language.

Meta said AI translation has been around for the past few years but has focused mainly on written languages. However, more than 40% of the world's 7,000+ languages are primarily oral and have no written standard.


The company wrote, “We plan to use our Hokkien translation system as part of a universal speech translator and will open source our model, code, and training data for the AI community to enable other researchers to build on this work.”

Hokkien speakers can now communicate with English speakers using Meta’s latest S2S translation technology. Meta claimed that more than 8,000 hours of Hokkien speech have been mined, along with the corresponding English translations, and added that the technology may be applied to other unwritten languages and could eventually work in real time.

Although the model is still under development and can translate only one full sentence at a time, Meta said, “It is a step towards a future where simultaneous translation between languages is achievable.”


Disha Chopra

