MiniMax M3: Architecture Hype vs. Unverified Claims
MiniMax M3 launched with MSA sparse attention for efficient 1M-token context, claiming 9.7× prefill and 15.6× decode speedups over M2.
- Vendor...

Created by Cheng Niu
Open‑source and flagship AI model releases, benchmarks, and safety notes across LLMs, vision, speech, and multimodal models
Explore the latest content tracked by AI Model Release Tracker
Midjourney rolled out key upgrades in May, expanding conversational prompting and adding the --no parameter to V8.1.
Anthropic's Opus 4.8 introduces dynamic workflows where Claude autonomously plans repository-scale migrations, writes its own orchestration scripts,...
OpenAI is offering GPT-5.5-Cyber access to nine UK banks via its TAC program, capitalizing on Anthropic's ongoing restrictions around the stronger...
LTX2 is the first open-source DiT-based audio-video foundation model, delivering synchronized high-fidelity video and audio generation that runs...
Nvidia's Cosmos 3 introduces a mixture-of-transformers architecture that reasons about object interactions and motion before generating video and...
MiniMax M3 launches as a 229.9B-parameter MoE model with only 9.8B active params, a 1M-token context window, and weights releasing in 10 days. It...
Nvidia launched two open models signaling a broad push into general and vertical AI.
AWS Bedrock now hosts Claude Opus 4.8, previews OpenAI models and Codex, and adds managed agents, materially reducing enterprise integration friction and signaling wider production genAI adoption.
A new arXiv paper introduces COLLEAGUE.SKILL, an open-source system that automatically converts expert traces into versioned, inspectable AI skill packages for LLM agents. The repo already boasts 18.5k GitHub stars and 215 community skills.
New dMoE framework resolves the token-vs-block mismatch in dLLM MoE models by aggregating expert distributions at the block level. It cuts uniquely...
New arXiv preprint introduces SANA-Streaming, a hybrid diffusion transformer for real-time streaming video-to-video editing.
A new arXiv preprint proposes Representation Forcing (RF), a training technique that lets unified multimodal models generate images directly in pixel...
VLMs are native 3D learners — focal length unification, text-based pixel references, and data scaling suffice for strong 3D performance without...
SwanVoice delivers a new zero-shot TTS model supporting expressive long-form synthesis for 1–4 speakers in both monologue and dialogue, trained...
The new DMind Benchmark, accepted at KDD 2026, tested 31 models including GPT-5, Claude, and Gemini across 3,543 questions in nine Web3 domains.
Anthropic is publicly releasing Mythos, the model it previously withheld via Project Glasswing due to its ability to autonomously find and exploit...