
Microsoft Unveils Phi-2, a Small Language Model Which Can Operate on Mobile Devices

Phi-2, equipped with 2.7 billion parameters (connections between artificial neurons), showcases performance akin to significantly larger models such as Meta’s Llama 2-7B, which contains 7 billion parameters.

Microsoft has launched Phi-2, a small language model (SLM) specialized in text-to-text tasks. According to Microsoft's official account on X, the model is compact enough to run on laptops or mobile devices.

Phi-2, equipped with 2.7 billion parameters (connections between artificial neurons), showcases performance akin to significantly larger models such as Meta's Llama 2-7B and Mistral-7B, each of which contains 7 billion parameters.


In its official blog post, Microsoft frames Phi-2 as part of a pursuit of smaller-scale language models that can match the capabilities of far larger ones. The researchers' key strategy is prioritizing high-quality training data: textbook-quality content and synthetic datasets built for common-sense reasoning and general knowledge, enriched with carefully selected web content chosen for its educational value.


Moreover, Microsoft leveraged knowledge transfer from Phi-1.5, a 1.3 billion parameter model, embedding its learned knowledge into the 2.7 billion parameter Phi-2. This technique not only expedites training but also substantially elevates Phi-2's benchmark performance.

On the technical side, Phi-2 is trained with a next-word prediction objective on 1.4 trillion tokens drawn from web datasets focused on NLP and coding. Training took 14 days on 96 A100 GPUs.
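To give a sense of what the next-word prediction objective means in practice, here is a minimal toy sketch (not Microsoft's actual training code): at each step the model assigns scores (logits) to every vocabulary token, and the training loss is the negative log-probability it gave to the token that actually came next. The four-token vocabulary and the logit values below are illustrative assumptions.

```python
import math

def next_token_loss(logits, target_index):
    """Cross-entropy loss for one next-token prediction step.

    logits: unnormalized scores the model assigns to each vocabulary token
    target_index: index of the token that actually came next in the text
    """
    # Softmax turns the logits into a probability distribution over the vocabulary.
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The loss is the negative log-probability of the true next token;
    # training drives this loss down, making correct continuations more likely.
    return -math.log(probs[target_index])

# Toy vocabulary of 4 tokens; the model strongly favors token 2.
# If token 2 really was the next word, the loss is small...
confident_loss = next_token_loss([0.1, 0.2, 3.0, 0.05], target_index=2)
# ...but if the true next token was token 0, the loss is large.
wrong_loss = next_token_loss([0.1, 0.2, 3.0, 0.05], target_index=0)
print(confident_loss < wrong_loss)  # True
```

Averaged over the 1.4 trillion training tokens, minimizing exactly this kind of loss is what shapes the model's weights.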

Fig 2: Safety scores computed on 13 demographics from ToxiGen. A subset of 6,541 sentences is selected and scored between 0 and 1 based on scaled perplexity and sentence toxicity. A higher score indicates the model is less likely to produce toxic sentences than benign ones. (Source: Microsoft)

Notably, the model is a base version that did not undergo alignment through reinforcement learning from human feedback or instruction fine-tuning. Despite the absence of this additional refinement, researchers observed that Phi-2 displays improved behavior regarding toxicity and bias compared to existing open-source models that did undergo alignment. (See Figure 2)


Boudhayan Ghosh
I am a Journalism and Communication graduate, currently working with Analytics Drift as an Associate Technology Journalist. My other hobbies include consuming art and watching football.
