Microsoft Unveils Phi-2, a Small Language Model Which Can Operate on Mobile Devices

Phi-2, equipped with 2.7 billion parameters (connections between artificial neurons), showcases performance akin to significantly larger models such as Meta’s Llama 2-7B, which contains 7 billion parameters.

By Boudhayan Ghosh

December 13, 2023

Microsoft has launched Phi-2, a small language model (SLM) specialized in text-to-text tasks. According to Microsoft’s official account on X, the model is compact enough to run on laptops or mobile devices.

Phi-2, equipped with 2.7 billion parameters (connections between artificial neurons), showcases performance akin to significantly larger models such as Meta’s Llama 2-7B, which contains 7 billion parameters, and Mistral-7B, another model boasting 7 billion parameters.
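To put the "runs on a laptop" claim in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy at different numeric precisions. The bytes-per-parameter values and the function name are illustrative assumptions, not figures from Microsoft; real memory use also includes activations, caches, and runtime overhead.

```python
# Rough weight-only memory estimate. Assumption: each parameter is stored
# at a fixed width; activations, KV cache, and runtime overhead are ignored.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

# Parameter counts as quoted in the article.
MODELS = {"Phi-2 (2.7B)": 2.7e9, "7B model": 7e9}

if __name__ == "__main__":
    for name, params in MODELS.items():
        for precision in BYTES_PER_PARAM:
            size = weight_memory_gb(params, precision)
            print(f"{name:13s} {precision}: {size:5.2f} GB")
```

Under these assumptions, a 2.7 billion parameter model at 16-bit precision needs roughly 5.4 GB for its weights, compared with about 14 GB for a 7 billion parameter model, which is why the smaller model is a more plausible fit for consumer hardware.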

Figure 1

In its official blog post, Microsoft describes Phi-2 as part of an effort to build smaller language models that match the capabilities of larger ones. The researchers’ key strategy is to prioritize high-quality training data: textbook-quality content and synthetic datasets designed to teach common-sense reasoning and general knowledge. They also augment the dataset with carefully selected web content filtered for educational value.

Moreover, Microsoft applied knowledge transfer from Phi-1.5, a 1.3 billion parameter model, embedding its knowledge into the 2.7 billion parameter Phi-2. According to the company, this technique both speeds up training and substantially improves Phi-2’s benchmark performance.

On the technical side, Phi-2 is trained with a next-word prediction objective on 1.4 trillion tokens drawn from synthetic and web datasets covering NLP and coding. Training took 14 days on 96 A100 GPUs.
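The training figures above can be turned into a few rough scale estimates. The following is a minimal sketch using only the numbers quoted in this article; the function name is hypothetical, and the throughput figure assumes all 96 GPUs were busy for the full 14 days, which the source does not state.

```python
# Figures quoted in the article.
TRAINING_DAYS = 14
NUM_GPUS = 96
TRAINING_TOKENS = 1.4e12
NUM_PARAMS = 2.7e9

def training_scale(days: int, gpus: int, tokens: float, params: float) -> dict:
    """Derive rough scale metrics from the headline training numbers."""
    gpu_hours = days * 24 * gpus
    return {
        "gpu_hours": gpu_hours,
        # Training tokens seen per model parameter.
        "tokens_per_param": tokens / params,
        # Assumption: every GPU is fully utilized for the whole run.
        "tokens_per_gpu_second": tokens / (gpu_hours * 3600),
    }

if __name__ == "__main__":
    for key, value in training_scale(
        TRAINING_DAYS, NUM_GPUS, TRAINING_TOKENS, NUM_PARAMS
    ).items():
        print(f"{key}: {value:,.1f}")
```

The run works out to 32,256 A100 GPU-hours and roughly 500 training tokens per parameter.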

Fig 2: Safety scores computed on 13 demographics from ToxiGen. A subset of 6541 sentences is selected and scored between 0 and 1 based on scaled perplexity and sentence toxicity. A higher score indicates the model is less likely to produce toxic sentences compared to benign ones. (Source: Microsoft)

This model is a base version: it did not undergo alignment through reinforcement learning from human feedback (RLHF) or instruction fine-tuning. Despite this, researchers observed that Phi-2 behaved better with respect to toxicity and bias than existing open-source models that did undergo alignment (see Figure 2).
