NVIDIA Unveils Mistral-NeMo-Minitron 8B, a Miniaturized Version of MistralAI NeMo 12B Model

www.analyticsdrift.com

Image source: Analytics Drift

After releasing the Mistral NeMo 12B model in collaboration with Mistral AI, NVIDIA has now introduced Mistral-NeMo-Minitron 8B, a small language model that is a compressed version of the original.

Image source: NVIDIA

NVIDIA launches Mistral-NeMo-Minitron 8B model

Mistral NeMo 12B, a 12-billion-parameter LLM, was developed for enterprise applications such as chatbots, summarization, and multilingual tasks.


Mistral NeMo 12B: A cutting-edge enterprise AI model

An SLM is a specialized AI model trained on a smaller dataset than an LLM. It is suited to specific tasks, such as marketing automation or customer service.

Image source: Canva

What is a small language model (SLM)?

Bryan Catanzaro, NVIDIA's VP for applied deep learning research, said, “We have used pruning to shrink 12 billion parameters to 8 billion. Distillation was used to improve model accuracy.”


Pruning and Distillation for Model Optimization
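NVIDIA's approach prunes structural components of the network guided by importance scores; as a minimal illustration of the underlying idea, the sketch below (an assumption for illustration, not NVIDIA's actual method) shows simple magnitude pruning, which zeroes out the smallest-magnitude weights in a layer:

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until `sparsity` fraction is removed."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to zero out
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
pruned = magnitude_prune(w, 0.5)  # roughly half of the 16 weights are zeroed
```

In practice, model compression like Minitron's removes entire neurons, attention heads, or layers rather than individual weights, so the pruned model stays dense and fast on real hardware.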

Mistral-NeMo-Minitron 8B can run on an NVIDIA RTX-powered workstation. It was distilled using NVIDIA NeMo, a platform for building generative AI applications, which transfers the larger model's learned capabilities to the smaller one.


Leveraging NVIDIA RTX and NeMo for optimized performance
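Distillation trains the smaller student model to match the larger teacher's output distribution rather than only the hard labels. A common formulation (a generic sketch, not NeMo's specific implementation) is the KL divergence between temperature-softened teacher and student predictions:

```python
import numpy as np

def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Numerically stable softmax with optional temperature scaling."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits: np.ndarray,
                      teacher_logits: np.ndarray,
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) over temperature-softened distributions."""
    p_t = softmax(teacher_logits, temperature)
    log_p_t = np.log(p_t + 1e-12)
    log_p_s = np.log(softmax(student_logits, temperature) + 1e-12)
    # Standard KD scales the KL term by T^2 to keep gradient magnitudes comparable
    return float((p_t * (log_p_t - log_p_s)).sum(axis=-1).mean() * temperature ** 2)

teacher = np.array([[2.0, 0.5, -1.0]])
student = np.array([[0.0, 0.0, 0.0]])
loss = distillation_loss(student, teacher)  # positive: distributions differ
```

A higher temperature softens both distributions, exposing the teacher's relative preferences among unlikely tokens, which is where much of the transferred "dark knowledge" lives.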

Small enterprises can deploy SLMs with limited resources, achieving LLM-like accuracy at lower costs. Mistral-NeMo-Minitron 8B was developed with this goal in mind.


Why was Mistral-NeMo-Minitron 8B developed?

Developers can further compress the Mistral-NeMo-Minitron 8B model to run on smartphones using NVIDIA AI Foundry, which supports pruning and distillation.


Customization using NVIDIA AI Foundry