Introducing Phi-2, Microsoft’s Small Language Model

Image source: Analytics Drift

Microsoft has launched its Phi-2 small language model (SLM), an AI program specialized in text-to-text tasks.

Image source: Microsoft

Microsoft’s official account on X states that this model is compact enough to operate seamlessly on laptops or mobile devices.

Image source: Canva

Phi-2, equipped with 2.7 billion parameters (connections between artificial neurons), showcases performance akin to significantly larger models such as Meta’s Llama 2-7B and Mistral-7B, each of which contains 7 billion parameters.

Image source: Canva

In its official blog post, Microsoft frames Phi-2 as part of its pursuit of smaller-scale language models that can match the capabilities of much larger ones.

Image source: Microsoft

The researchers’ key strategy is prioritizing high-quality training data, focusing on textbook-quality content and synthetic datasets that teach common-sense reasoning and general knowledge.

Image source: Canva

They also enrich their dataset with meticulously selected web content emphasizing educational value.

Image source: Canva

Microsoft also leveraged knowledge transfer from Phi-1.5, a 1.3-billion-parameter model, embedding its learned knowledge into the 2.7-billion-parameter Phi-2. This technique not only expedites training but also substantially elevates Phi-2’s benchmark performance.
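One way to picture this kind of knowledge transfer is seeding the larger model's weight matrices with the smaller model's trained weights, initializing only the newly added capacity from scratch. The sketch below is purely illustrative, with made-up shapes and names; it is not Phi-2's actual architecture or Microsoft's published procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def grow_weights(small, new_shape, scale=0.02):
    """Embed a trained weight matrix into a larger, freshly initialized one.

    The top-left block keeps the smaller model's learned parameters;
    the remaining rows/columns get a small random initialization.
    """
    big = rng.normal(0.0, scale, size=new_shape)
    rows, cols = small.shape
    big[:rows, :cols] = small  # transfer the learned parameters
    return big

# Stand-in for one trained Phi-1.5 layer (hypothetical 4x4 shape).
phi15_layer = rng.normal(0.0, 0.02, size=(4, 4))

# Grow it into a larger, Phi-2-sized layer (hypothetical 6x6 shape).
phi2_layer = grow_weights(phi15_layer, (6, 6))

print(phi2_layer.shape)                              # (6, 6)
print(np.allclose(phi2_layer[:4, :4], phi15_layer))  # True
```

Starting from transferred weights means the larger model begins training from an already useful point instead of random noise, which is why such transfer can speed up convergence.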

Image source: Microsoft

On the technical side, Phi-2 is trained with a next-word prediction objective, and it was trained on a massive 1.4 trillion tokens sourced from web datasets spanning NLP and coding.
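The next-word prediction objective can be sketched in a few lines: at each position, the model is scored on how much probability it assigns to the token that actually comes next. The toy "model" below is a hypothetical bigram lookup table standing in for a transformer's learned distribution; the vocabulary and probabilities are invented for illustration.

```python
import math

def next_word_loss(tokens, predict_next):
    """Average negative log-likelihood of each next token (cross-entropy)."""
    total = 0.0
    for i in range(len(tokens) - 1):
        context, target = tokens[: i + 1], tokens[i + 1]
        probs = predict_next(context)              # distribution over the vocab
        total += -math.log(probs.get(target, 1e-12))
    return total / (len(tokens) - 1)

# Hypothetical stand-in for a trained model: a bigram table.
bigram = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.9, "ran": 0.1},
}

def toy_model(context):
    return bigram.get(context[-1], {})

loss = next_word_loss(["the", "cat", "sat"], toy_model)
print(round(loss, 3))  # 0.308
```

Training drives this loss down across trillions of tokens; a real model like Phi-2 replaces the lookup table with a transformer that produces the probability distribution.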

Image source: GitHub




Produced by: Boudhayan Ghosh | Designed by: Prathamesh
