The Beijing Academy of Artificial Intelligence (BAAI), backed by the Chinese government, has developed Hua Zhibing, the first virtual student, built on WuDao 2.0. She has begun her education at Tsinghua University in Beijing, China.
WuDao 2.0 is one of the most prominent AI language models, with 1.75 trillion parameters, surpassing both GPT-3 and Google’s Switch Transformer in size. “WuDao 2.0 aims to enable ‘machines’ to think like ‘humans’ and achieve cognitive abilities beyond the Turing test,” said Tang Jie, the lead researcher behind WuDao 2.0.
Thanks to the model’s processing capabilities, the virtual student is expected to learn faster than the average human rate; her learning level is projected to advance from that of a six-year-old to that of a twelve-year-old within a year.
WuDao 2.0 is a pre-trained AI model that can simulate conversational speech, understand pictures, write poems, and even generate recipes. The model was trained with FastMoE, a Fast Mixture-of-Experts system similar to Google’s Mixture of Experts. FastMoE is an open-source system built on Facebook’s open-source framework, PyTorch, and works with commonly available accelerators. It provides a hierarchical interface for quickly building flexible model designs that adapt to various applications, and it supports large-scale parallel training.
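The core idea behind Mixture-of-Experts systems like FastMoE can be illustrated with a toy sketch (plain Python, not FastMoE’s actual API, and all names here are illustrative): a small gate network scores each token and routes it to only the top-scoring expert, so only a fraction of the model’s parameters are active per token even as total parameter count grows into the trillions.

```python
import math
import random

random.seed(0)

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

class ToyMoELayer:
    """Toy mixture-of-experts layer: a gate scores each input token
    and dispatches it to the single top-scoring expert (top-1 routing).
    Only that expert's computation runs for the token."""

    def __init__(self, n_experts, dim):
        # Random gate weights: one score row per expert.
        self.gate = [[random.uniform(-1, 1) for _ in range(dim)]
                     for _ in range(n_experts)]
        # Each "expert" is just a distinct scaling function here,
        # standing in for a full feed-forward sub-network.
        self.experts = [(lambda v, s=e + 1: [x * s for x in v])
                        for e in range(n_experts)]

    def forward(self, token):
        # Gate: dot-product score of the token against each expert row.
        scores = [sum(w * x for w, x in zip(row, token))
                  for row in self.gate]
        probs = softmax(scores)
        best = max(range(len(probs)), key=probs.__getitem__)
        # Sparse activation: only the chosen expert runs.
        return best, self.experts[best](token)

layer = ToyMoELayer(n_experts=4, dim=3)
expert_id, out = layer.forward([0.5, -0.2, 0.9])
print(expert_id, out)
```

In a real MoE system the experts are full feed-forward blocks distributed across accelerators, and the routing step is what FastMoE parallelizes at scale; this sketch only shows the gating logic.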
WuDao 2.0 has an added advantage over GPT-3: it can operate in both Chinese and English. This capability comes from training on 4.9 terabytes of text and images, including 1.2 terabytes of Chinese text, 1.2 terabytes of English text, and 2.5 terabytes of Chinese graphic data.
The researchers also say that the next-generation WuDao model will predict complex structures such as the 3D shapes of proteins, similar to DeepMind’s AlphaFold.