Image source: Analytics Drift
After releasing the Mistral NeMo 12B model in collaboration with Mistral AI, NVIDIA has now introduced a compressed version of it, Mistral-NeMo-Minitron 8B, a small language model (SLM).
Image source: NVIDIA
Mistral NeMo 12B, a 12-billion-parameter LLM, was developed for enterprise applications such as chatbots, summarization, and multilingual tasks.
Image source: NVIDIA
An SLM is a specialized AI model trained on a smaller dataset than an LLM and tailored to specific tasks, such as marketing automation or customer service.
Image source: Canva
Bryan Catanzaro, NVIDIA's VP for applied deep learning research, said, “We have used pruning to shrink 12 billion parameters to 8 billion. Distillation was used to improve model accuracy.”
Image source: NVIDIA
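To illustrate the pruning half of that recipe, here is a minimal PyTorch sketch of width pruning on a toy feed-forward block. The layer sizes, the FeedForward class, and the weight-norm importance score are all illustrative assumptions, not NVIDIA's actual method or dimensions.

```python
import torch
import torch.nn as nn

# Toy stand-in for one transformer feed-forward block;
# sizes are illustrative, not Mistral NeMo's real dimensions.
class FeedForward(nn.Module):
    def __init__(self, d_model=512, d_hidden=2048):
        super().__init__()
        self.up = nn.Linear(d_model, d_hidden)
        self.down = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        return self.down(torch.relu(self.up(x)))

def prune_hidden_neurons(ff: FeedForward, keep_ratio: float) -> FeedForward:
    """Drop the lowest-importance hidden neurons, shrinking both weight
    matrices. Importance here is the L2 norm of each neuron's incoming
    weights, a common heuristic; NVIDIA's actual criterion may differ."""
    n_keep = int(ff.up.out_features * keep_ratio)
    scores = ff.up.weight.norm(dim=1)          # one score per hidden neuron
    keep = scores.topk(n_keep).indices.sort().values
    pruned = FeedForward(ff.up.in_features, n_keep)
    with torch.no_grad():
        pruned.up.weight.copy_(ff.up.weight[keep])
        pruned.up.bias.copy_(ff.up.bias[keep])
        pruned.down.weight.copy_(ff.down.weight[:, keep])
        pruned.down.bias.copy_(ff.down.bias)
    return pruned

ff = FeedForward()
smaller = prune_hidden_neurons(ff, keep_ratio=0.67)  # ~12B -> ~8B in spirit
print(sum(p.numel() for p in ff.parameters()),
      "->", sum(p.numel() for p in smaller.parameters()))
```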
Mistral-NeMo-Minitron 8B runs on an NVIDIA RTX-powered workstation and was distilled using NVIDIA NeMo, a platform for developing generative AI applications, which transfers the larger model's learned knowledge to the smaller one.
Image source: NVIDIA
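Distillation, the second step, trains the smaller student model to imitate the larger teacher's outputs. Below is a minimal sketch of the classic soft-label distillation loss (Hinton et al.); the tensor shapes, temperature, and distillation_loss helper are illustrative assumptions, and NeMo's actual recipe may combine this with other objectives.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label distillation: the student is trained to match the
    teacher's softened output distribution."""
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_student = F.log_softmax(student_logits / t, dim=-1)
    # KL divergence between teacher and student distributions,
    # scaled by t^2 to keep gradient magnitudes comparable.
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * t * t

# Toy example: random "logits" over a 32k-token vocabulary.
teacher_logits = torch.randn(4, 32000)                       # frozen teacher
student_logits = torch.randn(4, 32000, requires_grad=True)   # trainable student
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
print(loss.item())
```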
Small enterprises can deploy SLMs with limited resources, achieving LLM-like accuracy at a lower cost; Mistral-NeMo-Minitron 8B was developed with this goal in mind.
Image source: NVIDIA
Using NVIDIA AI Foundry, which facilitates further pruning and distillation, developers can compress the Mistral-NeMo-Minitron 8B model even more so it can run on smartphones.
Image source: NVIDIA