Big tech new AI model launches

Key Questions

What is Meta's Muse Spark model?

Meta's Muse Spark, an effort led by Alexandr Wang, is a multimodal, agentic model that reportedly tops the SWE-Bench app leaderboard at #6. An open-source release is said to be imminent, and a post reshared by Wang claims it performs quite well on held-out real-world tasks.

How does Google Gemma 4 perform in benchmarks versus agent tests?

Gemma 4 sees over 10 million downloads per week and scores at what one tester calls 'genius' level on benchmarks. In that tester's own agent testing, however, it behaves like an 'intern', ignoring prompts and context.

What other notable AI model developments were mentioned?

Xiaomi's MiMo-V2-Pro tops leaderboards, with surges in Zhipu and MiniMax due to AI optimism. Claude 5 rumors circulate, and real-world evaluations challenge benchmark results.

Meta Muse Spark (led by Wang; multimodal/agentic; tops SWE-Bench app #6; open-source release imminent); Xiaomi MiMo-V2-Pro tops leaderboard; Google Gemma 4 at 10M+ downloads per week (benchmarks say genius, but agent tests flop: it ignores prompts and context); Claude 5 rumors; Zhipu and MiniMax surges; real-world evals challenge benchmarks.

Sources (2)

Updated Apr 10, 2026

Radar IA Startups

gemma 4 benchmarks say genius. my agent testing says intern. | techwizardrino | Webmatrices

@alexandr_wang reposted: @fchollet Actually performs quite well on held out real world tasks! Especially ...