A team of engineers and scientists worked with a Bay Area semiconductor manufacturer to create Jais, a powerful Arabic language model that can power generative AI applications.
With 13 billion parameters, the new large language model Jais was trained on a large collection of Arabic and English data, a portion of which is computer code. According to the group of academics and engineers behind the project, few large language models are bilingual, and that gap motivated the work.
The new language model was developed on supercomputers made by Silicon Valley-based Cerebras Systems, which makes chips the size of dinner plates that compete with Nvidia’s powerful AI hardware. Because Nvidia’s processors are in short supply due to China restrictions, businesses around the world are looking for alternatives.
Jais will be made available under an open source license. The group trained the model on a Condor Galaxy supercomputer owned by Cerebras. Cerebras said earlier this year that it had sold three of these units to G42, with the first due this year and the remaining two in 2024.
According to Timothy Baldwin, a professor at Mohamed bin Zayed University of Artificial Intelligence, there isn’t enough Arabic data to train a model the size of Jais, so the computer code found in the English language data helped train the model’s reasoning capabilities.
Jais, which takes its name from the highest mountain in the UAE, is the result of a partnership between the Mohamed bin Zayed University of Artificial Intelligence, Cerebras, and Inception, the AI-focused subsidiary of the Abu Dhabi-based technology conglomerate G42.