Researchers from Imperial College London have developed a new deep learning model for visual speech recognition in multiple languages. The new model outperforms some previously proposed models trained on larger datasets.
Pingchuan Ma, a Ph.D. graduate from Imperial College London, and his colleagues noticed that most visual speech recognition projects deal only with the English language. Their objective was to recognize speech in languages other than English from the lip movements of speakers and then compare the results with models trained to recognize English speech.
The new model created by the Imperial College researchers is architecturally similar to earlier speech recognition models, but the researchers optimized some of its hyperparameters, augmented the training dataset, and added extra loss functions. Through this careful design of the deep learning model, Ma and his colleagues achieved state-of-the-art performance in visual speech recognition.
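The article mentions that extra loss functions were added alongside the main training objective. A common way to do this is to sum a primary loss with a weighted auxiliary loss; the sketch below illustrates the general idea in plain Python. The function names, the use of cross-entropy, and the weighting value are illustrative assumptions, not details from the researchers' actual code.

```python
import math

# Hypothetical sketch: combining a primary loss with a weighted
# auxiliary loss, a common pattern when "additional loss functions"
# are used during training. Names and weights are assumptions.

def cross_entropy(probs, target_idx):
    """Negative log-likelihood of the target class."""
    return -math.log(probs[target_idx])

def combined_loss(probs, target_idx, aux_probs, aux_weight=0.3):
    """Primary loss plus a weighted auxiliary prediction loss.

    probs      -- main head's class probabilities
    aux_probs  -- auxiliary head's class probabilities
    aux_weight -- assumed weighting factor for the auxiliary term
    """
    primary = cross_entropy(probs, target_idx)
    auxiliary = cross_entropy(aux_probs, target_idx)
    return primary + aux_weight * auxiliary

# Example: a confident main head and a less confident auxiliary head.
loss = combined_loss([0.7, 0.2, 0.1], 0, [0.5, 0.3, 0.2])
print(round(loss, 4))
```

In practice the auxiliary term is computed from an extra prediction head and discarded at inference time; only the weighted sum drives the gradient updates during training.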
Some deep learning models in the past have achieved strong performance on visual speech recognition tasks, but they were trained primarily to recognize English speech, as most existing training datasets include only English. This limits their use to English-speaking contexts. The new model by the London researchers, in contrast, can operate in many languages.
In the future, the new model could inspire other researchers to develop alternative visual speech recognition systems that effectively recognize speech from the lip movements of speakers.