Block-Level Experts and Token Teachability Drive Efficiency Gains
Two papers introduce selective mechanisms that cut waste in large-model training and inference.
- dMoE aggregates token-level routing into...

Created by Feituntunee
Cutting‑edge foundation models, AI deployments, safety research, and open‑source tools
Explore the latest content tracked by AI Frontier Digest
AI foundation models are scaling across biology and medicine, moving from narrow tools to versatile systems that handle DNA, imaging, and preclinical...
OpenRouter's independent marketplace model has hit unicorn status at $1.3B after a $113M round led by Alphabet, offering access to 400+ models with...
Recent papers map rapid progress in agent capabilities alongside stubborn bottlenecks in skills, reasoning, and evaluation.
Four recent advances reveal a clear trend toward native, scalable multimodal models that eliminate separate encoders and quadratic scaling costs.
Nvidia's Isaac GR00T bundles open foundation models like GR00T N1 with simulation tools for academic humanoid R&D, delivering multimodal reasoning via language, vision, and proprioception while locking in GPU infrastructure demand.
Hybrid diffusion transformers with targeted system optimizations now enable real-time streaming video editing at 24 end-to-end FPS for 1280x704...
Google's Gemini Omni introduces a single multimodal model that generates and edits videos from text, images, audio, or video inputs through natural...
Poolside's Model Factory approach enabled training their 225.8B Laguna M.1 and 33.4B XS.2 MoE models for agentic coding from scratch using over 30...
Asian court systems are rapidly adopting localized generative AI tools to automate administrative tasks like transcription and filing, with the legal...
Recent research tackles three distinct AI bottlenecks:
AI in oncology has evolved from early skepticism to practical tools enhancing workflows, trials, and early detection.
GPIC delivers a 28-trillion-pixel permissively licensed image corpus that directly tackles dataset scale and licensing barriers, enabling wider...
DeepSeek's first native multimodal model finally adds vision to the open-source series, removing the friction of combining separate text and vision...
Google's I/O 2026 releases form a complementary stack: Gemini Embedding 2 unifies text, image, video, audio, and PDF retrieval in one semantic space,...
Two new papers chart both momentum and limits in turning video generation into interactive world models.
Two fresh papers address the infrastructure gap for capable agents.