Microsoft has launched Phi-2, a small language model (SLM) specialized in text-to-text tasks.
Microsoft’s official account on X states that this model is compact enough to operate seamlessly on laptops or mobile devices.
Phi-2, with 2.7 billion parameters (the connections between artificial neurons), performs on par with significantly larger models such as Meta’s Llama 2-7B and Mistral-7B, each of which contains 7 billion parameters.
Microsoft’s official blog post frames Phi-2 as part of its pursuit of smaller-scale language models that can match the capabilities of much larger ones.
The researchers’ key strategy is prioritizing high-quality training data, focusing on textbook-quality content and synthetic datasets for common-sense reasoning and general knowledge.
They also enrich their dataset with meticulously selected web content emphasizing educational value.
Microsoft also leveraged knowledge transfer from Phi-1.5, its 1.3-billion-parameter predecessor, embedding that model’s learned knowledge into the 2.7-billion-parameter Phi-2. This technique not only expedites training but also substantially improves Phi-2’s benchmark performance.
Turning to the model’s technical details, Phi-2 is trained with a next-word prediction objective on 1.4 trillion tokens drawn from synthetic and web datasets covering NLP and coding.
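To illustrate what a next-word prediction objective means in principle, here is a hypothetical toy sketch: the model learns, from training text, which word is most likely to follow a given word. A simple bigram frequency counter stands in for Phi-2’s neural network; the corpus, function names, and prediction rule are illustrative assumptions, not Microsoft’s implementation.

```python
from collections import Counter, defaultdict

# Toy training text (illustrative assumption, not Phi-2's data).
corpus = "the model predicts the next word the model learns".split()

# Count which word follows each word in the training text.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the continuation seen most often after `word` in training."""
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # "model" follows "the" most often in this corpus
```

A real language model replaces the frequency table with billions of learned parameters and conditions on the entire preceding context rather than a single word, but the training signal is the same: predict the next token.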
@analyticsdrift
Produced by: Boudhayan Ghosh
Designed by: Prathamesh