Meta has released the SpeechMatrix dataset, a large collection of multilingual speech-to-speech parallel data mined from VoxPopuli in seventeen languages, which researchers can use to build their own speech-to-speech (S2S) translation systems.
Meta developed an S2S system for Hokkien under its Universal Speech Translator (UST) project. Hokkien, one of Taiwan’s official languages, is widely spoken across the Chinese diaspora but has no standard written form. The company said its AI researchers developed translation tools for this language.
Meta said AI translation has been around for the past few years but has mainly targeted written languages. However, more than 40% of the world’s 7,000+ languages are primarily oral and have no written standard.
The company wrote, “We plan to use our Hokkien translation system as part of a universal speech translator and will open source our model, code, and training data for the AI community to enable other researchers to build on this work.”
Hokkien speakers can now communicate with English speakers using Meta’s latest S2S translation technology. Meta claimed that more than 8,000 hours of Hokkien speech have been mined, along with the corresponding English translations, and added that the technology could be applied to other unwritten languages and would eventually work in real time.
Although the model is still under development and can translate only one full sentence at a time, Meta said, “It is a step towards a future where simultaneous translation between languages is achievable.”