Google releases Gemma 4 12B open-source multimodal model for local deployment

Key Questions

What are the key features of Google's Gemma 4 12B model?

Gemma 4 12B is an encoder-free, open-source multimodal model. It supports a 256K context window, multi-token prediction (MTP), function calling, and a thinking mode, and it can analyze audio and video.

What hardware is required to run Gemma 4 12B locally?

The model runs on a typical laptop with 16GB of RAM, which makes it suitable for edge deployment. It also fits easily in 32-64GB VRAM setups alongside the larger 26B variant.
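The 16GB figure can be sanity-checked with simple arithmetic on weight storage. The sketch below is illustrative only: it assumes "12B" means roughly 12 billion parameters, it uses a hypothetical 2GB allowance for activations and cache, and the precisions listed are common choices rather than anything the sources state about how Gemma 4 12B is shipped.

```python
# Rough weight-memory estimate for a dense model. Illustrative only:
# the parameter count, precisions, and overhead are assumptions, not
# figures published for Gemma 4 12B.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def fits(num_params: float, precision: str, budget_gb: float,
         overhead_gb: float = 2.0) -> bool:
    """True if weights plus a hypothetical runtime overhead fit the budget."""
    return weight_memory_gb(num_params, precision) + overhead_gb <= budget_gb


if __name__ == "__main__":
    params = 12e9  # assumed: "12B" read as 12 billion parameters
    for precision in BYTES_PER_PARAM:
        gb = weight_memory_gb(params, precision)
        ok = fits(params, precision, budget_gb=16.0)
        print(f"{precision}: ~{gb:.0f} GB of weights, fits in 16 GB: {ok}")
```

Under these assumptions, 16-bit weights alone would exceed 16GB, so a 16GB laptop deployment most likely relies on a quantized build; the sources do not say which precision the claim refers to.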

Where can developers access Gemma 4 12B?

Google DeepMind released it as an open-source model for local deployment and experimentation, and it is available on Kaggle Models.

The new Gemma 4 12B from Google DeepMind is encoder-free and supports 256K context, MTP, function calling, and thinking mode. It runs on 16GB of RAM, which makes it accessible for edge deployment, and it complements the 26B variant. The model is open-source and fits easily in 32-64GB of VRAM.

Sources (2)

Updated Jun 4, 2026

@kaggle: Gemma 4 12B is now on Kaggle Models! 🤖 Learn more: 👉 https://t.co/PNKZIh9vPC

Google's new open source Gemma 4 12B analyzes audio, video — and runs entirely locally on a typical 16GB enterprise laptop