Google Gemma 4 Official Edge Multimodal Launch
Key Questions
What is Google Gemma 4?
Gemma 4 is a 26B MoE model with 25 tokens/second speed, officially released with a San Francisco demo event on April 17. It supports multimodal capabilities and is optimized for edge deployments.
Can Gemma 4 run natively on iPhone?
Yes, Gemma 4 supports native offline inference on iPhone via the Edge Gallery app. This enables mobile and privacy-focused SaaS applications.
What deployment options does Gemma 4 support?
It is optimized for low-cost edge deploys with wrappers on HF, Replicate, vLLM, and Ollama for B2C/B2B use. This fits into the open multimodal model surge.
Gemma 4 (26B MoE 25 tok/s) officially released w/SF demo event April 17, now native offline iPhone inference via Edge Gallery app for mobile/privacy SaaS; optimized for edge deploys enabling low-cost HF/Replicate vLLM/Ollama B2C/B2B wrappers amid open multimodal surge.