Microsoft expanded its Small Language Models (SLMs) lineup by launching the Phi-3 collection in April 2024. Phi-3 models delivered advanced capabilities and cost efficiency, surpassing similarly sized and larger models across key language, reasoning, coding, and math benchmarks. These models received valuable customer and community feedback, driving further AI adoption.
In August 2024, Microsoft introduced its latest AI innovation, the Phi-3.5 series. This cutting-edge collection features three open-source SLMs: a 3.82-billion-parameter mini-instruct, a 4.15-billion-parameter vision-instruct, and a 41.9-billion-parameter MoE-instruct. All three models support a 128k-token context length and show that, in the world of generative AI, performance is not determined by size alone.
The lightweight AI model Phi-3.5-mini-instruct is well suited for code generation, mathematical problem-solving, and logic-based reasoning tasks. Despite its small size, the mini version surpasses the Llama-3.1-8B-instruct and Mistral-7B-instruct models on the RepoQA benchmark for long context code understanding.
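To make the code-generation use case concrete, here is a minimal sketch of how a developer might prompt the mini model. The Phi-3 family's model card describes a chat template built from `<|system|>`, `<|user|>`, `<|assistant|>`, and `<|end|>` markers, which Hugging Face `transformers` normally applies via `tokenizer.apply_chat_template`. The helper below is an illustrative assumption-laden sketch of that format only; it builds the prompt string and does not download or run the model.

```python
# Sketch of the Phi-3-style chat prompt format (assumed from the Hugging Face
# model card). In practice, transformers' tokenizer.apply_chat_template
# produces this string for you; this helper just makes the structure visible.

def build_phi_prompt(messages):
    """Format a list of {"role", "content"} dicts into a Phi-3-style prompt."""
    parts = []
    for msg in messages:
        parts.append(f"<|{msg['role']}|>\n{msg['content']}<|end|>\n")
    parts.append("<|assistant|>\n")  # cue the model to generate its reply
    return "".join(parts)

prompt = build_phi_prompt([
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that reverses a string."},
])
print(prompt)
```

The resulting string would then be tokenized and passed to the model (for example, `microsoft/Phi-3.5-mini-instruct` on Hugging Face) for generation.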
Microsoft’s Mixture of Experts (MoE) model combines multiple expert sub-models, each specializing in different reasoning tasks. According to the Hugging Face documentation, the MoE model activates only 6.6 billion of its roughly 42 billion total parameters at inference time. The MoE model provides robust performance in code, math, and multilingual language understanding. It outperforms GPT-4o mini on benchmarks spanning subjects such as STEM, social sciences, and the humanities, at different levels of expertise.
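The efficiency claim above, that only a fraction of the parameters are active per token, can be sketched in code. The toy router below is an illustrative example, not Microsoft's implementation: a gating network scores all experts for each token, but only the top-k experts actually run, so compute scales with the active parameters rather than the total.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Mixture-of-Experts layer: 16 experts exist, but only the top-2 run
# per token, so most expert parameters stay idle on any given forward pass.
num_experts, top_k, d_model = 16, 2, 8
experts = [rng.standard_normal((d_model, d_model)) for _ in range(num_experts)]
gate_w = rng.standard_normal((d_model, num_experts))

def moe_forward(x):
    """Route a single token vector through its top-k experts only."""
    scores = x @ gate_w                    # gating score for every expert
    top = np.argsort(scores)[-top_k:]      # indices of the best top_k experts
    weights = np.exp(scores[top]) / np.exp(scores[top]).sum()  # softmax over top-k
    # Only top_k of num_experts expert matrices are multiplied here.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
out = moe_forward(token)
print(out.shape)
```

In this sketch, 2 of 16 experts run per token, mirroring (at toy scale) how Phi-3.5-MoE keeps only 6.6B of ~42B parameters active.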
The advanced multimodal Phi-3.5 vision model integrates text and vision processing capabilities. It is designed for general image understanding, chart and table comprehension, optical character recognition, and video summarization.
The Phi-3.5 mini model was trained on 3.4 trillion tokens over ten days using 512 H100-80G GPUs. The MoE model was trained on 4.9 trillion tokens over 23 days, while the vision model was trained on 500 billion tokens over six days using 256 A100-80G GPUs.
All three models are free for developers to download, use, and customize from Hugging Face under the MIT license. By releasing these models as open source, Microsoft enables developers to incorporate advanced AI features into their applications.