DeepMind took a major leap forward in AI innovation by launching a 2 billion-parameter model for its Gemma 2 family.
As the demand for advanced AI grows, there is a need for models that balance high performance with accessibility across various platforms. Many existing models are too resource-intensive for widespread use, limiting their application to high-end infrastructure.
To address this gap, Google DeepMind has introduced the Gemma 2 2B model to deliver outsized results. This article highlights the significance of the new addition to the Gemma 2 model family.
Inside Google DeepMind: A Short Glimpse into the Future of Technology
Google DeepMind, a subsidiary of Google, is a cutting-edge AI research lab renowned for deep learning and reinforcement learning tasks. It gained global recognition in 2016 when its AlphaGo program defeated world champions in the Go game. Following this notable achievement, DeepMind has continued to innovate with a series of AI language models, including Gato, Sparrow, Chinchilla, Gemini, and more.
Gemma: A Game Changer in AI-Language Models
On February 21st, 2024, DeepMind’s Gemma launched with a 7 billion parameter size suitable for desktop computers and small servers. Gemma is a family of lightweight, open-source large language models built on the same research and technology used to create Google Gemini. It is a text-to-text, decoder-only AI model available in English and comes with open weights for instruction-tuned and pre-trained versions.
Read More: Harnessing the Future: The Intersection of AI and Online Visibility
Gemma’s second generation, Gemma 2, was released in June. It includes two sizes: 9 billion (9B) parameters for higher-end desktop PCs and 27 billion (27B) parameters for large servers or server clusters.
To mark a leap forward in AI innovation, DeepMind announced a new 2 billion (2B) parameter version of the Gemma 2 model on July 31st, 2024. The Gemma 2 series’ 2B parameter model is designed for CPU usage and on-device applications. It has a more compact parameter size than the 9B and 27B versions. Still, it can deliver best-in-class performance for various text generation tasks, including question answering, summarization, and reasoning.
Top-Tier Performance of 2B Gemma 2 Parameter Model
The 2B Gemma 2 parameter model offers powerful capabilities for the generative AI field. Here are some key highlights:
- Flexible: The Gemma 2 model, with its 2 billion parameters, can run efficiently on a wide range of hardware platforms. These include data centers, local workstations, laptops, edge computing devices, and cloud platforms with Vertex AI and Google Kubernetes Engine (GKE).
- Integration for Streamlined Development: Gemma 2 2B allows you to integrate seamlessly with Keras, Hugging Face, NVIDIA Nemo, Ollama, and Gemma.cpp. It will soon support MediaPipe.
- Exceptional Performance: The company claims that Gemma 2 2B outperforms all GPT-3.5 models on the LMSYS Chatbot Arena leaderboard, a benchmark for evaluating AI chatbot performance.
- Open Standard: The 2B model is available under commercial-friendly Gemma terms for commercial and research use.
- Easily Accessible: The 2B Gemma 2 model’s lightweight design allows it to operate on the free tier of the NVIDIA T4 deep learning accelerator in Google Colab. This makes advanced AI accessible for experimentation and development without requiring high-end hardware.
- Improved Efficiency: Gemma 2 2B has been optimized using NVIDIA’s TensorRT-LLM library to improve efficiency and speed during inference.
- Continuous Learning through Distillation: The 2B model leverages knowledge distillation, learning from larger models by mimicking their behavior. This allows the new parameter model to achieve impressive performance despite its smaller size.
A Quick Look At Gemma 2B Model Training, Preprocessing, and Evaluation
The dataset for training Gemma 2 models includes web documents, mathematical text, code, and more. The 2B parameter model was trained on 2 trillion tokens using Tensor Processing Unit (TPU) hardware, JAX, and ML Pathways. To ensure quality, rigid preprocessing methods, such as CSAM filtering and sensitive data filtering, were applied.
The 2B model was evaluated based on text generation benchmarks, such as MMLU, BoolQ, MATH, HumanEval, and more. It was also assessed for ethics and safety using structured evaluations and internal red-teaming testing methods.
Gemma 2B Model Intended Usage
- Text Generation: The 2B model helps in creating various types of content, including poems, scripts, code, marketing materials, email drafts, and so on.
- Text Summarization: The 2B Gemma 2 model can produce concise summaries for research papers, articles, text corpus, or reports.
- Chatbots and Conversational AI: Enhance conversational interfaces for customer service, virtual assistants, and interactive applications.
- NLP Research: The 2B model provides a foundation for researchers to test Natural Language Processing (NLP) techniques, develop algorithms, and advance the field.
- Language Learning Tools: The model facilitates interactive language learning, including grammar correct and writing practice.
- Knowledge Exploration: The Gemma 2B model enables researchers to analyze large text collections and generate summaries or answer specific questions.
New Additions to Gemma 2 Model
DeepMind is adding two new models to the Gemma 2 family. Let’s take a brief look at them:
- ShieldGemma: It consists of safety classifiers designed to identify and manage harmful content in AI model inputs and outputs. ShieldGemma is available in various sizes; it targets hate speech, harassment, sexually explicit material, and dangerous content.
- Gemma Scope: Gemma Scope is focused on transparency. It features a collection of sparse autoencoders (SAEs), specialized neural networks that clarify the complex inner workings of the Gemma 2 models. These SAEs help users understand how the models process information and make decisions. There are more than 400 freely scalable SAEs covering all layers of the Gemma 2 2B model.
How to Get Started?
To get started, download Gemma 2 2B from Kaggle, Hugging Face, and Vertex AI Model Garden, or try its features through Google AI Studio.
Key Takeaways
Google DeepMind has upgraded the Gemma 2 model with a new 2 billion parameter version. Released on July 31st, 2024, this model is designed for on-device applications, offering efficient performance in tasks like text generation, summarization, and reasoning. It operates well on diverse hardware platforms, including local workstations and cloud services. The Gemma 2 2B model is optimized with NVIDIA’s TensorRT-LLM library and utilizes model distillation for improving performance.