Friday, November 22, 2024

NVIDIA Introduces a Miniaturized Version of Mistral NeMo 12B

Mistral-NeMo-Minitron 8B is a scaled-down version of the Mistral NeMo 12B model. It was created by pruning the model from 12 billion to 8 billion parameters. The pruned model was then distilled to recover accuracy using NVIDIA NeMo, a platform for developing generative AI applications.

On August 21, 2024, NVIDIA announced the release of Mistral-NeMo-Minitron 8B, a small language model (SLM) and a miniature version of the previously released Mistral NeMo 12B. NVIDIA had unveiled Mistral NeMo 12B, a cutting-edge LLM developed in collaboration with Mistral AI, on July 18, 2024. The 12B model can be deployed in enterprise applications to support chatbots, summarization, and multilingual tasks.


With 8 billion parameters, Mistral-NeMo-Minitron 8B is about two-thirds the size of Mistral NeMo 12B, which has 12 billion. It is a small language model, a compact AI model typically trained on smaller datasets than those used for LLMs.

SLMs are usually tailored to specific tasks such as sentiment analysis, basic text generation, and classification. They can run in real time on workstations and laptops. Small organizations that lack the resources for LLM infrastructure can deploy SLMs to use generative AI at lower cost.


Mistral-NeMo-Minitron 8B is small enough to run on an NVIDIA RTX-powered workstation. At the same time, it excels across various benchmarks for virtual assistants, chatbots, coding, and education applications. 
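As a rough back-of-the-envelope illustration of why the smaller parameter count matters on a workstation, the sketch below estimates the memory needed just to hold the weights. It assumes 16-bit (2-byte) weights, which the article does not state, and ignores activations, caches, and runtime overhead; the function name is hypothetical.

```python
def weight_memory_gb(num_parameters, bytes_per_parameter=2):
    """Estimate memory for model weights only, in decimal gigabytes.

    Assumption (not from the article): weights are stored in a 16-bit
    format, i.e. 2 bytes per parameter. Activations and any runtime
    overhead are deliberately left out.
    """
    return num_parameters * bytes_per_parameter / 1e9


# Parameter counts taken from the article: 12 billion and 8 billion.
parent_gb = weight_memory_gb(12_000_000_000)
minitron_gb = weight_memory_gb(8_000_000_000)

print(f"Mistral NeMo 12B weights:         ~{parent_gb:.0f} GB")
print(f"Mistral-NeMo-Minitron 8B weights: ~{minitron_gb:.0f} GB")
```

Under these assumptions the pruned model needs about a third less memory for its weights than the parent; actual requirements depend on the precision and runtime used.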

Bryan Catanzaro, NVIDIA’s vice president of applied deep learning research, said, “We have combined two AI optimization methods in this model. One is pruning to reduce parameters from 12 billion to 8 billion, and the other is distillation to transfer learnings of the Mistral NeMo 12B model to the Mistral-NeMo-Minitron 8B model. This helps the model to deliver accurate results similar to LLM at lower computational costs.”

The model development team first performed pruning, which shrinks the neural network by removing the model weights that contribute the least to its accuracy. The pruned model was then retrained through distillation to recover the accuracy lost to pruning.
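The two steps described above can be illustrated with a toy sketch. This is not NVIDIA's pipeline: it uses simple magnitude pruning (zeroing the smallest weights) as a stand-in for "removing the weights that contribute the least", and a standard temperature-scaled KL-divergence loss as the distillation objective. All function names, the tiny weight matrix, and the logits are hypothetical.

```python
import math


def magnitude_prune(weights, keep_fraction):
    """Zero the smallest-magnitude weights, keeping roughly keep_fraction.

    Assumption: absolute value is used as a proxy for how much a weight
    contributes. A real pruning pipeline would typically remove whole
    structures and shrink the network rather than leave zeros behind.
    """
    magnitudes = sorted((abs(w) for row in weights for w in row), reverse=True)
    keep = max(1, round(len(magnitudes) * keep_fraction))
    threshold = magnitudes[keep - 1]
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Minimizing this during retraining pushes the smaller (student) model
    to reproduce the larger (teacher) model's output distribution.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy weight matrix; keeping 2/3 of the weights mirrors the 12B -> 8B ratio.
toy_weights = [[0.9, -0.05, 0.4], [0.01, -0.7, 0.2]]
pruned = magnitude_prune(toy_weights, keep_fraction=2 / 3)
print(pruned)

# The loss is zero when the student matches the teacher, positive otherwise.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))
print(distillation_loss([2.0, 1.0, 0.1], [0.5, 1.5, 0.3]))
```

In this sketch the student would be the pruned network and the teacher the original one; the loss gives the retraining step a signal for closing the accuracy gap that pruning opens.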

The distillation process for Mistral-NeMo-Minitron 8B was performed using NVIDIA NeMo, a platform for developing generative AI applications. Developers can compress the model further for smartphone use by applying pruning and distillation through NVIDIA AI Foundry. The resulting compressed model is built using a fraction of the parent model's training data and infrastructure but still offers high accuracy.

NVIDIA has emerged as a significant player among companies offering AI services. Its products, especially its AI chips, are increasingly being adopted for a range of applications, and the company's share value has surged by 170% in the current year. The launch of Mistral-NeMo-Minitron 8B adds momentum to NVIDIA's strategy of diversifying its AI services.


Analytics Drift
Editorial team of Analytics Drift
