Tuesday, September 17, 2024

Microsoft Announces New Cutting-Edge Phi-3.5 Model Series

Microsoft unveils its next-generation Phi-3.5 models: Mini, MoE, and Vision, open models that are expected to outperform their rivals.

Microsoft expanded its Small Language Models (SLMs) lineup by launching the Phi-3 collection in April 2024. The Phi-3 models delivered advanced capabilities and cost efficiency, surpassing similarly sized and larger models across key language, reasoning, coding, and math benchmarks. These models received valuable customer and community feedback, driving further AI adoption.


In August 2024, Microsoft introduced its latest AI innovation, the Phi-3.5 series. This cutting-edge collection features three open-source SLMs: a 3.82 billion parameter mini-instruct, a 4.15 billion parameter vision-instruct, and a 41.9 billion parameter MoE-instruct. These models support a 128k token context length and show that performance is not solely determined by size in the world of Generative AI.


The lightweight Phi-3.5-mini-instruct model is well suited for code generation, mathematical problem-solving, and logic-based reasoning tasks. Despite its small size, the mini version surpasses the Llama-3.1-8B-instruct and Mistral-7B-instruct models on the RepoQA benchmark for long-context code understanding.
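To illustrate the kind of code-generation task the mini model handles, here is a minimal sketch using the Hugging Face transformers text-generation pipeline. The model id `microsoft/Phi-3.5-mini-instruct` is the model's Hugging Face hub name; the chat-message format and generation settings below are illustrative assumptions, not details from this article.

```python
def build_messages(prompt: str) -> list:
    """Wrap a user prompt in the chat-message format instruct models expect."""
    return [{"role": "user", "content": prompt}]


def generate_code(prompt: str, max_new_tokens: int = 256) -> str:
    """Sketch: run a code-generation prompt through Phi-3.5-mini-instruct.

    Assumes the transformers library is installed and the model can be
    downloaded from the Hugging Face hub; settings here are illustrative.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    generator = pipeline(
        "text-generation",
        model="microsoft/Phi-3.5-mini-instruct",  # hub id of the mini model
        trust_remote_code=True,
    )
    outputs = generator(build_messages(prompt), max_new_tokens=max_new_tokens)
    return outputs[0]["generated_text"]


# Example prompt for the code-generation use case described above:
example = build_messages("Write a Python function that reverses a string.")
```

Calling `generate_code(...)` would download the model weights on first use, so it is best run on a machine with a suitable GPU.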


Microsoft’s Mixture of Experts (MoE) model combines multiple expert sub-models, each specializing in different reasoning tasks. According to the Hugging Face documentation, the MoE model activates only 6.6 billion of its 41.9 billion total parameters during inference. The MoE model provides robust performance in code, math, and multilingual language understanding. It outperforms GPT-4o mini on benchmarks spanning various subjects, such as STEM, social sciences, and humanities, at different levels of expertise.

The advanced multimodal Phi-3.5-vision model integrates text and image processing capabilities. It is designed for general image understanding, chart and table comprehension, optical character recognition, and video summarization.
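The vision model is prompted with text plus numbered image placeholders. The sketch below shows this pattern; the model id `microsoft/Phi-3.5-vision-instruct` and the `<|image_N|>` placeholder format are taken from the model's Hugging Face card, and the loading code is an illustrative assumption rather than an excerpt from this article.

```python
def build_image_prompt(question: str, n_images: int = 1) -> str:
    """Prefix a question with numbered image placeholders, one per image,
    in the <|image_N|> format used by the Phi-3.5 vision model card."""
    placeholders = "".join(f"<|image_{i}|>\n" for i in range(1, n_images + 1))
    return placeholders + question


def describe_image(image_path: str, question: str) -> str:
    """Sketch: ask Phi-3.5-vision-instruct about a local image.

    Assumes transformers and Pillow are installed; model id and processor
    usage follow the Hugging Face model card and are not from this article.
    """
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    model_id = "microsoft/Phi-3.5-vision-instruct"
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    image = Image.open(image_path)
    prompt = build_image_prompt(question)
    inputs = processor(text=prompt, images=[image], return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=256)
    return processor.batch_decode(output_ids, skip_special_tokens=True)[0]
```

A chart-comprehension call, for example, might pass a PNG of a chart with the question "Summarize this chart" — one of the use cases the article lists.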

The Phi-3.5-mini model was trained on 3.4 trillion tokens over ten days using 512 H100-80G GPUs. The MoE model was trained on 4.9 trillion tokens over 23 days, and the vision model on 500 billion tokens over six days using 256 A100-80G GPUs.

All three models are free for developers to download, use, and customize on Hugging Face under the MIT license. By releasing these models under an open-source license, Microsoft enables developers to incorporate advanced AI features into their applications.


Analytics Drift
Editorial team of Analytics Drift