Google Gemini 3/3.1 + Gemma 4 Open-Source Multimodal Push + DiffusionGemma + Gemini Omni Flash

Key Questions

What are the main capabilities of Gemini 3/3.1?

Gemini 3/3.1 leads in multimodal tasks, File Search, SymptomAI, and DeepMind's SOTA math performance, which includes AlphaProof Nexus solving Erdős problems with Lean verification. It also introduces Gemini Omni and enterprise agents, alongside Gemini Spark, a 24/7 agentic assistant planned for I/O 2026.

What is Gemma 4 and why is it notable?

Gemma 4 is a 12B encoder-free multimodal model released by Google DeepMind under the Apache 2.0 license, and it runs on a 16GB laptop. It supports local agentic workflows and includes quantization-aware training (QAT) variants optimized for mobile and laptop efficiency in a ~1GB format.
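To put the 16GB laptop claim in context, here is a back-of-the-envelope sketch of weights-only memory for a 12B-parameter model. The bit widths are illustrative assumptions, not Gemma 4's published formats, and the estimate ignores activations, KV cache, and runtime overhead.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Weights-only footprint in GB (1 GB = 10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS_12B = 12e9   # parameter count quoted in the digest
LAPTOP_RAM_GB = 16  # laptop memory quoted in the digest

# Illustrative bit widths (assumed, not taken from any Gemma 4 release notes).
for bits in (16, 8, 4):
    gb = weight_memory_gb(PARAMS_12B, bits)
    status = "below" if gb < LAPTOP_RAM_GB else "above"
    print(f"{bits:>2}-bit weights: {gb:.0f} GB ({status} {LAPTOP_RAM_GB} GB)")
```

Under these assumptions, 16-bit weights alone come to 24 GB, while 8-bit and 4-bit weights come to 12 GB and 6 GB, which is why quantized variants matter for laptop-class hardware.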

How is Google advancing agentic AI systems?

Google introduces new RL orchestration via Maestro and uses agent protocols such as MCP and A2A to accelerate agentic systems, alongside Gemini 3.5 Flash, which bets on autonomous agents and software building. A case study with igot.ai reports multi-agent autonomous workforces achieving over 300% evidence discovery and 95% workflow completion on Google Cloud.

What new models and techniques has Google released for generation tasks?

Google released DiffusionGemma, a diffusion-based text generation model under Apache 2.0 that achieves a 4x speedup on local hardware, and Gemini Omni Flash, a SOTA video generation and editing model available via API. DiffusionGemma challenges the autoregressive paradigm, while Gemini Omni Flash expands multimodal capabilities.

What improvements address LLM reliability and math performance?

Google's 'faithful uncertainty' technique improves LLM reliability, while the LEAP harness wraps LLMs in Lean verifier feedback, raising Putnam solve rates from under 10% to 70%. DeepMind cautions that large-scale AI agent deployment remains unsafe today.
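The idea of wrapping an LLM in verifier feedback can be sketched as a propose, check, retry loop. This is a minimal illustration under stated assumptions, not the LEAP harness itself: `solve_with_feedback`, `toy_propose`, and `toy_verify` are hypothetical names, and the toy verifier is a stand-in for a real Lean check.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces for illustration only; not LEAP's actual API.
Proposer = Callable[[str, str], str]          # (problem, feedback) -> candidate proof
Verifier = Callable[[str], Tuple[bool, str]]  # candidate -> (accepted, error message)


def solve_with_feedback(
    problem: str,
    propose: Proposer,
    verify: Verifier,
    max_rounds: int = 4,
) -> Optional[str]:
    """Return the first candidate the verifier accepts, or None if none is found."""
    feedback = ""
    for _ in range(max_rounds):
        candidate = propose(problem, feedback)
        accepted, message = verify(candidate)
        if accepted:
            return candidate
        feedback = message  # the verifier's error text guides the next attempt
    return None


# Toy stand-ins: a real setup would call an LLM and a Lean checker here.
def toy_propose(problem: str, feedback: str) -> str:
    return "rfl" if "try rfl" in feedback else "sorry"


def toy_verify(candidate: str) -> Tuple[bool, str]:
    if candidate == "rfl":
        return True, ""
    return False, "proof uses sorry; try rfl"


result = solve_with_feedback("theorem t : 1 + 1 = 2", toy_propose, toy_verify)
print(result)
```

In this toy run the first attempt is rejected and the second, informed by the error message, is accepted. The design point is that only verifier-accepted output is ever returned, so the loop cannot report a solution the checker rejected.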

Gemini 3/3.1 leads in multimodal tasks, File Search, SymptomAI, and DeepMind's SOTA math. Gemma 4, a 12B open-source multimodal model (Apache 2.0), runs on a 16GB laptop, enabling local agentic workflows. Google AI Studio is now at 1.2M apps per week. DiffusionGemma achieves a 4x speedup on local hardware, challenging the autoregressive paradigm. Gemini Omni Flash delivers SOTA video generation and editing. Apple integrates Gemini into Siri. DeepMind admits that large-scale AI agent deployment is unsafe today. The new ViQ tokenizer achieves 20-70% training acceleration. JetSpec speculative decoding reaches up to a 9.64x speedup and is open-source.

Updated Jun 30, 2026

LLM Innovation Tracker