AI Model Release Tracker

Google Gemma 4 open multimodal/agentic family + edge SOTAs + INT4 quants

Key Questions

What is Google Gemma 4?

Google Gemma 4 is a family of open-weight multimodal and agentic AI models released by Google DeepMind under the Apache 2.0 license. It includes the E2B, E4B, 26B MoE, and 31B dense variants, which accept text, image, and audio inputs. Demos are available on Hugging Face, on mobile, and on YouTube.

What sizes are available in the Gemma 4 family?

The Gemma 4 family consists of four variants: two edge models sized by effective parameter count, E2B and E4B; a 26B Mixture-of-Experts (MoE) model with about 4B active parameters; and a 31B dense model. They target a range of hardware, including consumer GPUs such as the RTX 4090.

What license does Gemma 4 use?

Gemma 4 models are released under the Apache 2.0 license, which allows broad commercial and research use. They are available on Hugging Face, where they are currently trending.

What are the key benchmark scores for Gemma 4?

Gemma 4 scores 85.7% on GPQA and 89% on AIME. Running as an 8-bit MLX quantization, it scores 76.9% on MMMU-Pro and 85.6% on MATH-Vision, reportedly outperforming larger proprietary models.

Can Gemma 4 run on consumer hardware?

Yes. The 26B MoE model runs on a single RTX 4090 at 162 tokens/second for decoding and 8,400 tokens/second for prompt processing. The 31B model requires 24 GB of VRAM with Q4 quantization.
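
For a rough sense of what these figures imply, the arithmetic can be sketched in Python. This is a back-of-the-envelope estimate, not a measurement. The function names are hypothetical. The memory estimate assumes exactly 4 bits per weight and counts weights only, and the timing estimate assumes the quoted throughput figures hold constant for the whole request.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes.

    Assumption: every parameter is stored at exactly `bits_per_weight` bits.
    Real quantization formats add scale/zero-point metadata, and inference
    also needs room for the KV cache and activations.
    """
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


def generation_seconds(prompt_tokens: int, output_tokens: int,
                       prefill_tps: float, decode_tps: float) -> float:
    """Approximate wall-clock time: prompt processing plus token-by-token decode.

    Assumption: both throughput figures stay constant regardless of context length.
    """
    return prompt_tokens / prefill_tps + output_tokens / decode_tps


if __name__ == "__main__":
    # 31B dense model at an idealized 4 bits per weight.
    print(f"31B @ 4-bit weights: {weight_memory_gb(31, 4):.1f} GB")

    # 26B MoE on an RTX 4090, using the throughput figures quoted above.
    secs = generation_seconds(prompt_tokens=8400, output_tokens=162,
                              prefill_tps=8400, decode_tps=162)
    print(f"8,400-token prompt + 162-token reply: {secs:.1f} s")
```

Under these assumptions the 31B weights come to about 15.5 GB. The distance between that and the 24 GB figure would be taken up by quantization metadata, the KV cache, and runtime overhead, which this sketch does not model.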

What is Google AI Edge Eloquent?

Google AI Edge Eloquent is a free on-device voice recognition tool that was quietly released for iOS. It processes speech on the device, with no cloud dependency.

Is Gemma 4 multimodal?

Yes. Gemma 4 accepts text and image inputs, and its model card also details audio-and-text inference. It is billed as frontier multimodal intelligence on device.

How popular is Gemma 4?

Gemma 4 ranks #1 on Hugging Face for both trending and downloads. Numerous YouTube demos and developer guides highlight its performance.

Gemma 4 E2B/E4B/26B/31B under Apache 2.0 (HF/mobile/YT demos; GPQA 85.7%, AIME 89%, MMMU-Pro 76.9%); #1 on HF trending; INT4-quantized models now on HF via GoogleAI/IntelAI for edge inference; plus AI Edge Eloquent iOS voice recognition.

Updated Apr 8, 2026