Google Announces the Launch of Gemma 2 for Improved AI Capabilities


By Pallavi Singal

Google DeepMind introduces Gemma 2, a new generation of open AI models designed to enhance performance and efficiency across diverse hardware platforms. Available in 9 billion and 27 billion parameter sizes, Gemma 2 aims to revolutionise AI deployment with its cost-effective, high-performance capabilities.

Google DeepMind has launched Gemma 2, a new generation of lightweight, open AI models with improved performance. This advanced open model is derived from the same research and technology used to create the Gemini models. The Gemma family has expanded to include CodeGemma, RecurrentGemma, and PaliGemma, each designed for specific AI tasks and accessible through integrations with partners like Hugging Face, NVIDIA, and Ollama.

Gemma 2 is available in 9 billion (9B) and 27 billion (27B) parameter sizes, surpassing the first generation in both performance and efficiency during inference, while also incorporating significant safety enhancements. The 27B model provides competitive alternatives to models more than twice its size. This advanced capability is now possible on a single NVIDIA H100 Tensor Core GPU or TPU host, greatly reducing deployment costs.

Google launches Gemma 2, redefining AI performance and efficiency

Gemma 2 is built on a redesigned architecture, crafted for exceptional performance and efficient inference. At 27 billion parameters, Gemma 2 offers performance that competes with models more than twice its size. Even the 9 billion parameter version surpasses other open models in its category, including Llama 3 8B. For detailed performance metrics, refer to the technical report.

The 27B Gemma 2 model is designed to execute inference tasks efficiently at full precision on a single Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, or NVIDIA H100 Tensor Core GPU. This capability significantly reduces deployment costs while maintaining high performance, making AI deployments more accessible and budget-friendly.

Gemma 2 is optimised for rapid inference across diverse hardware environments, from advanced gaming laptops and high-performance desktops to cloud-based setups. Users can experience Gemma 2 in full precision on Google AI Studio, unlock local performance with the quantised version using Gemma.cpp on their CPU, or deploy it on personal machines with an NVIDIA RTX or GeForce RTX GPU via Hugging Face Transformers.
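As a minimal sketch of the Hugging Face Transformers route mentioned above, the snippet below formats a prompt with Gemma's chat-turn markers and shows how a text-generation pipeline could be constructed. The model ID and generation settings are illustrative assumptions; actually running generation requires accepting the model license and downloading the weights.

```python
# Sketch: running Gemma 2 locally via Hugging Face Transformers.
# The model ID and settings below are assumptions for illustration;
# generation requires accepting the license and downloading weights.

def format_gemma_prompt(user_message: str) -> str:
    """Wrap a user message in Gemma's chat-turn markers."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

def build_pipeline(model_id: str = "google/gemma-2-9b-it"):
    """Lazily construct a text-generation pipeline (heavy: downloads weights)."""
    from transformers import pipeline  # imported here so the sketch stays cheap
    return pipeline("text-generation", model=model_id, device_map="auto")

if __name__ == "__main__":
    prompt = format_gemma_prompt("Summarise Gemma 2 in one sentence.")
    generator = build_pipeline()
    print(generator(prompt, max_new_tokens=64)[0]["generated_text"])
```

The pipeline is built lazily inside a function so that tooling can import the prompt helper without pulling multi-gigabyte weights.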

Gemma 2 by Google enhances accessibility: Tailored for developers and researchers

Gemma 2 isn't just more powerful; it's designed to seamlessly integrate into user workflows:

Open and Accessible: Gemma 2 is available under the commercially friendly Gemma license to empower developers and researchers to share and commercialise their innovations with ease.

Wide Framework Compatibility: Gemma 2 is compatible with major AI frameworks such as Hugging Face Transformers, JAX, PyTorch, and TensorFlow (via native Keras 3.0), as well as vLLM, Gemma.cpp, Llama.cpp, and Ollama. This facilitates effortless integration with users' preferred tools and workflows.

Gemma 2 is optimised with NVIDIA TensorRT-LLM for running on NVIDIA-accelerated infrastructure or as an NVIDIA NIM inference microservice, with optimisation for NVIDIA NeMo planned.

Effortless Deployment: Beginning next month, Google Cloud customers can effortlessly deploy and manage Gemma 2 on Vertex AI, streamlining the deployment process.

Gemma 2 models can also be fine-tuned for specific tasks, with practical examples and recipes in the new Gemma Cookbook designed to guide users through building applications.
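As an illustrative sketch of preparing data for task-specific fine-tuning (not a recipe from the Gemma Cookbook itself), the helper below renders prompt/response pairs into single training strings using Gemma's chat-turn markers. The record layout and field names ("prompt", "response") are assumptions for illustration.

```python
# Sketch: preparing prompt/response pairs for supervised fine-tuning.
# The turn markers follow Gemma's chat format; the record field names
# ("prompt", "response") are illustrative assumptions.

def to_training_text(example: dict) -> str:
    """Render one prompt/response pair as a single training string."""
    return (
        "<start_of_turn>user\n"
        f"{example['prompt']}<end_of_turn>\n"
        "<start_of_turn>model\n"
        f"{example['response']}<end_of_turn>\n"
    )

def build_dataset(pairs: list) -> list:
    """Map raw pairs to training strings, skipping empty responses."""
    return [to_training_text(p) for p in pairs if p.get("response")]

sample = [{"prompt": "What is Gemma 2?", "response": "An open model family."}]
texts = build_dataset(sample)
```

Strings produced this way could then be tokenised and fed to a standard supervised fine-tuning loop in any of the frameworks listed above.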

Advancing responsible AI with Gemma 2

The Responsible Generative AI Toolkit by Google now includes the newly open-sourced LLM Comparator. This tool assists developers and researchers in conducting thorough evaluations of language models.

Users can utilise the companion Python library to perform comparative assessments using their models and data, visualising results within the application. Preparations are also underway to open source SynthID, a text watermarking technology tailored for Gemma models.

During the development of Gemma 2, Google committed to internal safety protocols. This included filtering pre-training data and subjecting the model to extensive testing and evaluation against a comprehensive range of metrics to detect and mitigate potential biases and risks. Google's findings are openly shared across a wide array of public benchmarks concerning safety and representational harms.

Gemma 2 by Google: Empowering future innovation

The original Gemma release sparked over 10 million downloads and inspired numerous groundbreaking projects. For example, the Navarasa project used Gemma to create a model that celebrates India's rich linguistic diversity.

Google continues to explore new architectures and develop specialised variants of Gemma to address an expanded spectrum of AI challenges. This includes the upcoming 2.6 billion parameter Gemma 2 model, designed to further bridge the gap between lightweight accessibility and high-performance capabilities.

Gemma 2 is accessible on Google AI Studio, enabling users to test its full capabilities at 27 billion parameters without specific hardware requirements. Model weights for Gemma 2 can be downloaded from platforms like Kaggle and Hugging Face Models, with availability soon in the Vertex AI Model Garden.

To facilitate research and development access, Gemma 2 is also offered free of charge through Kaggle and via a free tier for Colab notebooks. Additionally, first-time Google Cloud customers may qualify for $300 in credits. Academic researchers can apply for the Gemma 2 Academic Research Program, which offers Google Cloud credits to accelerate research efforts.
