Google has launched a new voice transfer (VT) module for its text-to-speech (TTS) systems. The module is aimed at people who have lost their voices or who have atypical speech. It lets a TTS system speak in the person's own voice, making communication easier.
People can lose their voices to conditions such as ALS, muscular dystrophy, or certain hereditary diseases. Losing one’s voice can also mean losing part of one’s identity, and Google’s technology aims to give that part back.
The module can be used in either a few-shot or a zero-shot mode. In few-shot mode, the model is adapted using samples from the speaker’s past voice recordings. In zero-shot mode, it needs only a short audio sample as a reference, even if the person has never had a typical voice. This makes the zero-shot mode well suited to people who have no archive of past recordings.
One of the key strengths of Google’s VT module is that it plugs into existing TTS systems, where it can restore voices from small speech samples, whether typical or atypical. The technology is also multilingual: it can transfer a voice across languages, which broadens the settings where it can be applied.
Technology this powerful needs safeguards against misuse, and Google has incorporated audio watermarking into the system. The technique embeds hidden information in the synthesized audio, making it possible to detect unauthorized use of the voice transfer technology.
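To give a feel for how hidden information can ride along in audio, here is a toy sketch of one classic approach, spread-spectrum watermarking. It is not Google's method, which the article does not describe. The function names, the secret `key`, and the `strength` value are all assumptions made for illustration. A key-seeded sequence of +1/-1 values is added to the samples at low amplitude, and a detector that knows the key checks whether the audio correlates with that sequence.

```python
import math
import random


def pn_sequence(key, n):
    """Pseudo-random +1/-1 sequence derived from a secret key (hypothetical scheme)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def embed_watermark(samples, key, strength=0.02):
    """Add the key's sequence to the audio at low amplitude."""
    pn = pn_sequence(key, len(samples))
    return [s + strength * p for s, p in zip(samples, pn)]


def detect_watermark(samples, key, strength=0.02):
    """Correlate the audio with the key's sequence.

    Marked audio correlates at roughly `strength`; unmarked audio, or audio
    checked with the wrong key, correlates near zero.
    """
    pn = pn_sequence(key, len(samples))
    correlation = sum(s * p for s, p in zip(samples, pn)) / len(samples)
    return correlation > strength / 2


# Stand-in for synthesized speech: three seconds of a 220 Hz tone at 16 kHz.
tone = [0.3 * math.sin(2 * math.pi * 220 * t / 16000) for t in range(48000)]
marked = embed_watermark(tone, key=1234)

print(detect_watermark(marked, key=1234))  # watermark present, right key
print(detect_watermark(tone, key=1234))    # no watermark
print(detect_watermark(marked, key=9999))  # wrong key
```

A production system would shape the mark so listeners cannot hear it and would make it survive compression and re-recording. This sketch does neither and only shows the embed-then-detect idea.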
Google’s zero-shot voice transfer module represents a significant leap forward in personalized voice technology. It allows people with speech impairments to communicate more effectively, opening up new possibilities.