AI Innovation Pulse

Multimodal/research: GPT-Image-2/Meta Wang OSS/VLMs/LIBERO-Para/Agentic-MME/3D GaussianGPT/math/Altman/LeCun evals

Key Questions

What is GPT-Image-2?

A model reported to be OpenAI's GPT-Image-2 has leaked on Arena, where it shows advanced capabilities and has been described as 100% production-ready.

What are Meta's new OSS models under Wang?

Meta plans to release open-source versions of its new models, developed under Alexandr Wang, alongside a safety-related reorganization.

What is LIBERO-Para?

LIBERO-Para is a diagnostic benchmark for paraphrase robustness in vision-language-action (VLA) models.

What biases affect VLMs?

Vision-language models (VLMs) show semantic and geometric biases; recent papers explore the alignment tax of tokenized versus continuous geometry.

What is Agentic-MME?

Agentic-MME evaluates agentic capabilities in multimodal models; ViGoR-Bench, covered alongside it, targets visual reasoning.

What are GaussianGPT and EgoSim?

GaussianGPT and EgoSim advance 3D generation and egocentric world simulation for embodied interaction.

What math benchmarks involve Erdős and ARC-AGI-4?

Gemini has posted solutions to ARC-AGI-4; related work focuses on math reasoning, such as Erdős problems.

What are views from Altman and LeCun on future AI?

Altman predicts a post-transformer era; LeCun advances LpJEPA and Gemma4 evals, addressing paper hallucinations and DARPA zero agents.

Meta new OSS models (Wang reorg/safety); LIBERO-Para VLA paraphrase robustness; GPT-Image-2 leak; VLMs semantic/geometric bias; Agentic-MME; GaussianGPT/EgoSim; Erdős/ARC-AGI-4; Altman post-transformers; LeCun LpJEPA/Gemma4; paper hallucinations/DARPA zero agent.

Sources (13)
Updated Apr 8, 2026