The audio AI landscape in 2026 continues to accelerate at a remarkable pace, driven by the convergence of **privacy-first, ultra-low-latency, highly efficient on-device and browser-native inference**. Recent developments not only reinforce earlier breakthroughs but deepen the ecosystem’s technical sophistication and market reach, especially around specialized hardware, browser runtimes, real-time speech agents, and multimodal generative AI. Together, these advances enable powerful, expressive, and privacy-preserving audio intelligence that operates seamlessly at the edge, without cloud dependencies.
---
### Google Nano Banana 2: Deep Dive into Technical Excellence and Market Impact
Google’s **Nano Banana 2** edge AI chip remains the undisputed backbone of privacy-first, pro-level audio AI acceleration. Beyond its earlier viral launch video, a new comprehensive technical and market analysis by U深研 provides vital insights into its capabilities and positioning:
- **Technical Highlights**:
- Custom-designed for continuous streaming speech workloads, Nano Banana 2 achieves **“flash speeds”** with sustained compute efficiency, balancing high throughput and low power consumption.
- Integrated support for both ASR and TTS pipelines on mobile, wearable, and embedded platforms—enabling **real-time inference without battery drain**.
- Extensive architectural optimizations around search grounding, pipeline parallelism, and low-precision computation enable significant latency reductions.
- Compatibility and synergy with Google’s Gemini AI ecosystem amplify voice AI functionalities across smartphones, smart home devices, industrial IoT sensors, and wearables.
- Strict on-device privacy protocols ensure **user audio data remains fully local**, eliminating cloud exposure.
- **Market and Strategic Positioning**:
- Positioned as a **pro-level edge AI solution**, Nano Banana 2 targets diverse sectors from consumer electronics to industrial voice interfaces.
- Its energy-efficient design appeals to OEMs seeking to embed sophisticated audio AI without compromising device autonomy or privacy.
- The chip’s launch has catalyzed an ecosystem of hardware-software co-optimization, inviting third-party developers to innovate on top of its platform.
- As @ammaar aptly summarized, Nano Banana 2 delivers a **transformative leap in streaming speech AI**, raising the bar for edge inference speed and quality.
This deep technical and market grounding underscores Nano Banana 2’s role as a foundational pillar for next-generation, privacy-first voice AI.
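The chunked, bounded-context streaming loop that such edge decoders run can be sketched abstractly. Everything below (chunk size, decoder behavior, context window) is an illustrative assumption, not an actual Nano Banana 2 SDK or API:

```python
from collections import deque

CHUNK_MS = 80  # hypothetical streaming chunk size

def audio_chunks(total_ms=400, chunk_ms=CHUNK_MS):
    """Stand-in for a microphone driver yielding fixed-size PCM chunks."""
    for start in range(0, total_ms, chunk_ms):
        yield list(range(start, start + chunk_ms))  # fake samples

def stream_asr(chunks, context_chunks=3):
    """Toy incremental decoder: emit one partial hypothesis per chunk while
    keeping only a bounded left context, as on-device decoders do to cap
    memory use and latency."""
    context = deque(maxlen=context_chunks)
    for i, chunk in enumerate(chunks):
        context.append(chunk)
        yield f"partial-{i}"

partials = list(stream_asr(audio_chunks()))
print(partials)  # ['partial-0', 'partial-1', 'partial-2', 'partial-3', 'partial-4']
```

The bounded `deque` is the key design choice: per-chunk work stays constant regardless of utterance length, which is what makes sustained streaming feasible on a power-constrained accelerator.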
---
### Browser-Native, Privacy-First ASR: TranslateGemma 4B and Real-Time Model Synergy
Google DeepMind’s **TranslateGemma 4B** continues to push the envelope in fully **serverless, browser-native speech recognition and translation**, now broadly accessible with WebGPU acceleration. Key advances include:
- Achieving up to **30× real-time speech recognition** speeds directly inside browsers, eliminating the need for cloud-based ASR.
- Delivering instant **multilingual transcription and translation** with zero network roundtrips, empowering privacy-conscious voice applications on any device with WebGPU support.
- Seamlessly integrating with complementary real-time ASR models such as **Mistral Voxtral Realtime** and **Mistral Transcribe 2**, which push sub-second latency and high accuracy for offline and streaming scenarios.
This ecosystem of browser-native and edge ASR models marks a fundamental shift toward **decentralized, user-controlled speech AI**—reducing latency, enhancing privacy, and widening accessibility across platforms.
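Figures like “30× real-time” reduce to a real-time factor: audio duration divided by processing time. A small helper makes the arithmetic concrete (plain Python, independent of any browser API):

```python
def realtime_speedup(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many times faster than real time a transcription ran.
    A value of 30.0 means one hour of audio transcribes in two minutes."""
    if processing_seconds <= 0:
        raise ValueError("processing time must be positive")
    return audio_seconds / processing_seconds

# At the quoted 30x figure, a 60-minute recording takes 2 minutes:
print(realtime_speedup(audio_seconds=3600, processing_seconds=120))  # 30.0
```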
---
### Real-Time Speech Agents and Developer Tooling: OpenAI’s gpt-realtime-1.5 & Realtime API
OpenAI’s release of **gpt-realtime-1.5** alongside its comprehensive **Realtime API quick-start guide** significantly lowers the barrier for integrating real-time conversational AI into production:
- Enables **sub-second streaming latency** for fluid, natural voice conversations in telephony, live workflows, and interactive applications.
- Features tight synchronization between ASR and language generation modules, minimizing conversational lag and improving response coherence.
- Comes with robust SDKs and tutorials that accelerate deployment in domains like customer support, live call transcription, and AI-assisted IVR systems.
- Demonstrates enhanced noise robustness for reliable operation in challenging acoustic environments.
This tooling expansion accelerates the adoption of scalable and natural voice experiences, pushing real-time speech agents from experimental demos into **mainstream production use**.
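The low perceived latency of such agents comes largely from pipelining: the synthesis stage starts speaking while recognition is still transcribing, rather than waiting for the full utterance. That overlap can be modeled with standard `asyncio` queues; the stage functions below are toy stand-ins, not OpenAI's actual Realtime API:

```python
import asyncio

async def asr(audio_q, text_q):
    """Toy ASR stage: turns audio chunks into words as they arrive."""
    while (chunk := await audio_q.get()) is not None:
        await text_q.put(f"word-{chunk}")
    await text_q.put(None)  # propagate end-of-stream

async def tts(text_q, spoken):
    """Toy TTS stage: speaks each word as soon as ASR emits it,
    instead of buffering the whole response (streaming overlap)."""
    while (word := await text_q.get()) is not None:
        spoken.append(word)

async def main():
    audio_q, text_q, spoken = asyncio.Queue(), asyncio.Queue(), []
    for chunk in [0, 1, 2, None]:  # None marks end of audio
        audio_q.put_nowait(chunk)
    await asyncio.gather(asr(audio_q, text_q), tts(text_q, spoken))
    return spoken

print(asyncio.run(main()))  # ['word-0', 'word-1', 'word-2']
```

Because both stages run concurrently, end-to-end latency approaches the latency of the slowest single stage rather than the sum of all stages, which is how sub-second turn-taking becomes achievable.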
---
### Dual-Track TTS Evolution: Expressive Cloud Voices vs. Fast On-Device Synthesis
Text-to-speech synthesis continues to bifurcate to meet diverse user needs:
- **Cloud-scale expressive TTS**: Models such as **MOSS-TTS**, **Qwen3-TTS**, and the open-source **Voicebox** deliver richly nuanced, emotionally expressive voices suited to storytelling, audiobooks, and immersive long-form content. Voicebox notably surpasses commercial offerings such as ElevenLabs in both naturalness and expressivity, enabling a high-fidelity audio experience.
- **Fast, privacy-conscious on-device TTS**: Models like **KittenTTS** and **Faster Qwen3TTS** offer CPU-only synthesis at speeds up to **4× real-time**, making them perfect for embedded voice assistants and latency-sensitive environments where user privacy is paramount.
This dual-path TTS strategy ensures that voice synthesis technology remains versatile—capable of powering both cloud-rendered, studio-grade audio and instantaneous, privacy-preserving edge synthesis.
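A stated synthesis rate such as “4× real-time” maps directly to a latency budget: synthesis time equals audio duration divided by the real-time factor. A hedged helper for deciding whether on-device synthesis fits an interaction deadline (all numbers illustrative):

```python
def synthesis_latency(audio_seconds: float, realtime_factor: float) -> float:
    """Time to synthesize a clip at a given speed-over-real-time factor.
    At 4x real time, a 2-second reply costs 0.5 s of CPU."""
    if realtime_factor <= 0:
        raise ValueError("real-time factor must be positive")
    return audio_seconds / realtime_factor

def fits_budget(audio_seconds: float, realtime_factor: float,
                budget_seconds: float) -> bool:
    """True if on-device synthesis meets the interaction deadline."""
    return synthesis_latency(audio_seconds, realtime_factor) <= budget_seconds

print(synthesis_latency(2.0, 4.0))               # 0.5
print(fits_budget(2.0, 4.0, budget_seconds=0.6))  # True
```

This is the core trade-off behind the dual-track split: when the budget check fails on-device, an application can fall back to cloud synthesis at the cost of a network round trip and reduced privacy.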
---
### Foundational Advances: MAEB Expansion, Quantization Innovations, and Unified Tokenization
Several foundational technologies underpin the rapid progress in audio AI:
- The **Massive Audio Embedding Benchmark (MAEB)** has expanded to cover over **30 diverse audio tasks**, now including **generative music** alongside speech and environmental sound recognition. This comprehensive benchmark promotes balanced progress across analytical and creative audio AI domains.
- Real-time ASR models such as **Mistral Voxtral Realtime** and **Mistral Transcribe 2** continue to push the boundaries of sub-second latency with high accuracy, complementing browser-native TranslateGemma 4B to create a broad, privacy-preserving speech recognition ecosystem.
- Quantization and compression techniques have matured, with Alibaba’s **Qwen 3.5 Medium Model Series (N3)** leveraging **INT4 quantization**, supplemented by **MLX-9bit** and **Nanoquant** methods. These advances drastically reduce model size and compute requirements without compromising voice quality, enabling practical deployment on constrained edge devices.
- The introduction of the **MOSS-Audio-Tokenizer** offers a unified tokenization scheme encoding speech, music, and environmental sounds into compact token sequences. This innovation facilitates cross-domain learning and transfer, accelerating the development of versatile, multimodal audio models.
Together, these foundational advances ensure audio AI remains **scalable, interoperable, and practical** for deployment across heterogeneous hardware and applications.
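The INT4 idea behind these quantization schemes can be illustrated in a few lines: map float weights onto 16 signed integer levels with a shared scale, then dequantize. This is a minimal symmetric, per-tensor sketch, not any vendor's production kernel (real implementations typically use per-channel or per-group scales):

```python
def quantize_int4(weights):
    """Symmetric INT4 quantization: 16 signed levels in [-8, 7],
    with a single per-tensor scale mapping the largest magnitude to 7."""
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [level * scale for level in q]

weights = [0.42, -1.3, 0.07, 2.1, -0.88]
q, scale = quantize_int4(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q)                     # [1, -4, 0, 7, -3]
print(max_err <= scale / 2)  # True: rounding error stays within half a step
```

Each weight now needs 4 bits instead of 32, an 8× size reduction before any further compression, which is what makes billion-parameter models plausible on constrained edge devices.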
---
### Creative Multimodal Audio-Visual Generation: Elevating Immersive Experiences
Multimodal AI research continues to expand the frontiers of immersive audio-video-gesture synthesis:
- Google DeepMind’s **Lyria 3** leads the charge in autonomous music generation, capable of producing stylistically rich, 30-second compositions that integrate directly into Gemini applications. This empowers creators with instant access to professional-quality music generation.
- Transformer diffusion models like **DreamID-Omni**, **JavisDiT++**, and **OmniGAIA** push the state of joint audio-video avatar synthesis, enabling realistic virtual agents with tightly synchronized speech, facial expressions, and gestures. These breakthroughs are transforming telepresence and virtual collaboration by delivering more natural, engaging interactions.
- Gesture synchronization research, exemplified by **DyaDiT** and the latest **SkyReels-V4** models, advances lifelike nonverbal communication in virtual agents, further enhancing the naturalness and expressivity of multimodal interactions.
This fusion of audio, visual, and gestural modalities is revolutionizing entertainment, content creation, and digital communication—paving the way for richer, more natural user experiences.
---
### Production-Ready Deployments: Audio AI Goes Mainstream
The industry is witnessing a decisive transition from research prototypes to widespread production deployments:
- **On-device voice assistants** powered by Nano Banana 2 chips deliver blazing-fast, privacy-first AI experiences on mobile and embedded platforms.
- **Zero-server browser captioning and transcription** powered by TranslateGemma 4B democratizes real-time multilingual speech processing without cloud dependencies.
- **Telephony and customer support systems** utilizing OpenAI’s Realtime API enable smooth, natural voice interactions, live call transcription, and AI-powered conversational assistance at scale.
These deployments exemplify a new industry standard grounded in **privacy, accessibility, and responsiveness**, reshaping how voice AI integrates into everyday digital life.
---
### Looking Ahead: Toward a Unified, Privacy-Centric Multimodal Audio AI Ecosystem
As 2026 progresses, the audio AI field is converging toward an integrated ecosystem characterized by:
- **Unified tokenization and generation frameworks** capable of fluidly handling speech, music, environmental sounds, and visual modalities.
- Coexistence and interoperability of **expressive cloud TTS and ultra-low-latency ASR** models across cloud, edge, and browser platforms.
- Real-time conversational agents growing ever more **context-aware, robust, and natural**, powered by sophisticated speech agent frameworks and developer tooling.
- Privacy-first inference as the baseline, enabled by specialized hardware like Nano Banana 2 and browser-native runtimes such as TranslateGemma 4B, drastically minimizing data exposure.
- Empowerment of novel content formats and experiences through creative multimodal generation tools.
- Continued transparent, balanced progress ensured by benchmarks like MAEB.
- Democratization of innovation globally through open-source releases and diffusion-driven architectures.
This convergence promises to embed **intelligent, private, and real-time audio and multimodal AI** deeply into daily digital experiences—powering smarter assistants, instant transcription, personalized content, and immersive virtual environments, all while rigorously safeguarding user privacy.
---
### Summary of the Latest Highlights
- **Google Nano Banana 2:** Verified as a technical and market leader, with detailed analysis highlighting its flash-speed streaming ASR/TTS, energy efficiency, and broad device integration.
- **TranslateGemma 4B:** Browser-native WebGPU speech recognition and translation running fully serverless at up to 30× real-time.
- **OpenAI gpt-realtime-1.5 & Realtime API:** Enhanced developer tooling enabling real-time conversational speech agents at sub-second latency.
- **Dual-Track TTS:** Continued leadership in cloud expressive (MOSS-TTS, Qwen3-TTS, Voicebox) and on-device fast models (KittenTTS, Faster Qwen3TTS).
- **MAEB Expansion:** Inclusion of generative music tasks alongside speech and environmental audio.
- **Real-Time ASR Models:** Mistral Voxtral Realtime and Transcribe 2 pushing sub-second latency with high accuracy.
- **Quantization & Compression:** INT4, MLX-9bit, Nanoquant methods enabling efficient edge deployment.
- **Unified Tokenization:** MOSS-Audio-Tokenizer fostering cross-domain audio model learning.
- **Creative Multimodal Advances:** Google DeepMind’s Lyria 3, DreamID-Omni, SkyReels-V4, DyaDiT enhancing audio-video-gesture synthesis.
- **Production Deployments:** Privacy-first on-device assistants, zero-server browser captioning, and telephony voice AI entering mainstream use.
In sum, 2026’s unfolding innovations unlock the **full potential of fast, private, expressive, and production-ready audio and multimodal AI systems**, heralding a new era of intelligent, accessible voice and multimedia experiences seamlessly embedded into daily life.