# The 2026 Revolution in Diffusion and Generative Models: Foundations, Innovations, and Societal Impact
The year 2026 marks a transformative milestone in artificial intelligence, particularly for diffusion and generative models. These models have evolved from academic curiosities into the core engines of **real-time**, **multimodal**, and **physically grounded** content creation, and now underpin a broad spectrum of scientific, industrial, and societal applications. This revolution is characterized by a **tight integration of theoretical insight, engineering breakthroughs, and cross-disciplinary methods**, yielding models that are **more powerful, accessible, and trustworthy** than ever before.
Building upon over a decade of foundational research, recent developments have fortified the **theoretical underpinnings**, advanced **scaling laws**, and enhanced the **practical deployment** of these models, heralding a new era of AI-driven innovation.
---
## Reinforcing Foundations: Geometry-Aware and Physics-Informed Diffusion
A key trend in 2026 has been the deepening of **geometry-aware** and **physics-informed diffusion models**. These approaches embed **structural constraints** and **physical laws** directly into the generative process, ensuring that outputs are **not only visually appealing** but also **scientifically faithful** and **grounded in reality**:
- **Probing Diffusion Geometry with the String Method:** A notable breakthrough is the introduction of the **string method** for understanding the **geometry of diffusion models**. This framework computes **continuous paths** between samples by evolving **curves (strings)** in the data space, revealing how models interpolate and navigate complex data manifolds. As detailed in the recent paper *"Probing the Geometry of Diffusion Models with the String Method"*, researchers can now **visualize** and **analyze** the intrinsic structure of diffusion processes, leading to better **interpretability** and **robustness**.
- **Manifold-Aware Diffusion Techniques:** Researchers have advanced **Latent Riemannian Diffusion Models with Mixed Curvature**, enabling models to represent data on **complex geometric manifolds** such as **3D shapes**, **molecular structures**, and **social networks**. These techniques improve **interpretability** and **scientific fidelity**, vital in domains like **biomedical diagnostics** and **engineering design**.
- **Physics-Informed Diffusion:** Embedding **dynamic physical laws** into models has become standard practice:
  - In **robotics**, models now incorporate **topological constraints** and **dynamics**, leading to **robust control systems** capable of functioning reliably amid environmental uncertainties.
  - In **biomedical visualization**, respecting **biological constraints** yields **more accurate diagnostics** and **trustworthy representations**.
- **Structure-Preserving Architectures:** Innovations such as **HodgeFormer Transformers** facilitate **structure-aware operations** on **complex surfaces** like **triangular meshes**, supporting **scientific modeling** and **precise design**.
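The string-method idea above can be sketched in a few lines. The paper's actual method operates on a trained diffusion model's score; the version below is a toy stand-in that relaxes a discretized path ("string" of beads) on an explicit energy landscape supplied by the caller, with the standard two-phase update (gradient descent on interior beads, then equal-arc-length reparameterization). All names and defaults here are illustrative, not taken from the paper.

```python
import numpy as np

def string_method(x_start, x_end, grad_energy, n_beads=20, n_iters=200, step=1e-2):
    """Toy string method: relax a discretized path between two samples.

    `grad_energy(x)` is assumed to return the gradient of an energy
    landscape (e.g. a learned negative log-density); it stands in for the
    diffusion score the real method would use.
    """
    # Linear interpolation as the initial string of "beads".
    ts = np.linspace(0.0, 1.0, n_beads)[:, None]
    beads = (1 - ts) * x_start + ts * x_end

    for _ in range(n_iters):
        # 1) Move interior beads downhill; endpoints stay fixed.
        beads[1:-1] -= step * np.array([grad_energy(b) for b in beads[1:-1]])
        # 2) Reparameterize to equal arc length so beads stay spread out.
        seg = np.linalg.norm(np.diff(beads, axis=0), axis=1)
        arc = np.concatenate([[0.0], np.cumsum(seg)])
        arc /= arc[-1]
        uniform = np.linspace(0.0, 1.0, n_beads)
        beads = np.stack(
            [np.interp(uniform, arc, beads[:, d]) for d in range(beads.shape[1])],
            axis=1,
        )
    return beads
```

On a quadratic bowl (energy gradient `lambda x: x`), the relaxed string bows toward the low-energy region near the origin while keeping its endpoints pinned, which is exactly the kind of data-manifold-respecting interpolation the geometric analysis exposes.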
**Significance:** These advances ensure that generated content **respects the underlying physical and geometric realities**, greatly enhancing **trustworthiness**, **interpretability**, and **applicability** across **scientific**, **engineering**, and **medical** fields.
---
## Major Efficiency Gains: Enabling Real-Time, Large-Scale Deployment
A defining feature of 2026 is the **dramatic acceleration** in diffusion sampling and inference, transforming models from **computationally intensive** to **real-time** tools:
- **Analytical Diffusion Formulations:** Techniques like **Fast and Scalable Analytical Diffusion** leverage **closed-form solutions** to condense what was once hundreds of iterative steps into **just a handful of computations**. Dr. Lisa Chen from MIT emphasizes, “This revolutionizes diffusion from a **slow, iterative process** into an **immediate, scalable method** suitable for **live applications**.”
- **Learned Adaptive Integrators:** These **dynamically optimized solvers** efficiently approximate solutions to diffusion ODEs, enabling **instantaneous content editing** and **scientific visualization** with minimal latency.
- **Transformer and LLM Acceleration:** Breakthroughs such as **FlashAttention** and **Amber-Image** have **significantly reduced memory and compute overhead**, supporting **scaling to larger architectures** and **higher-resolution outputs**. The advancements also facilitate **edge deployment** on resource-constrained devices like **smartphones** and **embedded systems**, thanks to **advanced compression**.
- **Faster Language Models:** Techniques like **sink-aware pruning** have achieved **up to 14x inference speedups** in **diffusion-based language models (DLMs)**, enabling **instant multimodal interactions** and **on-device AI applications**.
- **Instant Content Generation:** Models such as **FMLM**, employing **continuous denoising in a single step**, now produce **high-quality audio and text instantaneously**, revolutionizing **entertainment**, **accessibility**, and **communication**.
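The core trick behind most few-step samplers can be illustrated with a deterministic DDIM-style update, which collapses hundreds of training timesteps down to a handful of evenly spaced steps. This is a generic sketch of that well-known update rule, not the specific analytical formulation named above; `eps_model` and the schedule are placeholders.

```python
import numpy as np

def ddim_sample(eps_model, x_T, alpha_bars, n_steps=8):
    """Deterministic DDIM-style sampler over a small subset of timesteps.

    `eps_model(x, t)` predicts the noise in x at timestep t;
    `alpha_bars[t]` is the cumulative signal level, with alpha_bars[0] = 1.
    """
    T = len(alpha_bars) - 1
    # Evenly spaced subset of timesteps, e.g. 8 instead of T.
    steps = np.linspace(T, 0, n_steps + 1).round().astype(int)
    x = x_T
    for t, t_prev in zip(steps[:-1], steps[1:]):
        a_t, a_prev = alpha_bars[t], alpha_bars[t_prev]
        eps = eps_model(x, t)
        # Estimate the clean sample, then re-noise it to level t_prev.
        x0_pred = (x - np.sqrt(1 - a_t) * eps) / np.sqrt(a_t)
        x = np.sqrt(a_prev) * x0_pred + np.sqrt(1 - a_prev) * eps
    return x
```

Because the update is deterministic, the same latent always maps to the same output, which is what makes interactive editing with few-step samplers predictable.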
**Impact:** These innovations make **high-resolution video synthesis**, **real-time editing**, and **embodied AI systems** **practical**, **scalable**, and **integrated** into everyday life.
---
## Architectural Innovations and Multimodal Integration
The architecture of diffusion models has evolved to **seamlessly process and generate multimodal data**, enabling **more natural**, **controllable**, and **coherent** content:
- **Unified Multimodal Frameworks:** Architectures like **JavisDiT++** exemplify **joint modeling** of **audio**, **video**, and **text** within **single, unified frameworks**. This facilitates **coherent multi-sensory content synthesis**, supporting applications ranging from **multimedia creation** to **interactive AI assistants**.
- **Hybrid Autoregressive-Diffusion Systems:** Frameworks such as **DREAMON** combine **autoregressive** and **diffusion** mechanisms, delivering **semantic** and **cross-modal** synthesis with **exceptional coherence**.
- **Latent Guidance & Perceptual Losses:** Techniques like **latent forcing** steer **trajectories in latent space**, enabling **controllable** and **perceptually aligned outputs**. The **"Podcast on Unified Latents"** discusses how **joint training** of diffusion priors and decoders using **Unified Latents** supports **diverse**, **stable**, and **controllable** multimodal content creation.
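The "latent forcing" idea of steering trajectories in latent space can be sketched generically: at each step, combine the model's score with a guidance gradient that pulls the latent toward a desired condition. Both callables below are illustrative stand-ins, not any published framework's API.

```python
import numpy as np

def guided_latent_trajectory(x0, score, guide_grad, n_steps=50, step=0.05, scale=2.0):
    """Toy latent guidance: follow the model's score plus a scaled
    guidance gradient, recording the trajectory.

    `score(x)` stands in for a learned score function;
    `guide_grad(x)` is the gradient of a differentiable objective
    (e.g. a perceptual or conditioning loss) with respect to the latent.
    """
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(n_steps):
        x = x + step * (score(x) + scale * guide_grad(x))
        traj.append(x.copy())
    return np.stack(traj)
```

With a Gaussian score `-x` and a pull toward a target latent, the trajectory settles at a compromise between the prior and the condition; raising `scale` shifts that compromise toward the condition, which is the controllability knob the text describes.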
**Outcome:** These architectural advances **enhance naturalness**, **controllability**, and **holistic content generation**, unlocking opportunities in **creative arts**, **scientific modeling**, and **interactive systems**.
---
## Real-Time High-Resolution Video, Motion Synthesis, and Embodied AI
Thanks to efficiency and architectural innovations, **live high-fidelity video synthesis** has become mainstream:
- **Interactive Video Production:** Tools like **SpargeAttention2** enable **real-time, high-resolution video generation** for **virtual production**, **entertainment**, and **interactive media**.
- **Super-Resolution & Fast Rendering:** Systems such as **SLA2** push **resolution** and **speed**, supporting **real-time broadcasting**, **gaming**, and **virtual reality**.
- **Lifelike Motion Transfer:** Approaches like **SMRNet** excel at **human motion synthesis**, powering **virtual avatars** and **telepresence**.
- **Autonomous Virtual Agents:** Models like **SARAH** integrate **causal transformers** with **flow matching autoencoders**, creating **lifelike virtual agents** capable of **long-term interactions** and **multi-hour reasoning**.
- **Embodied AI & Robotics:** Techniques such as **EgoPush**—which combine **diffusion models** with **reinforcement learning**—enable **end-to-end egocentric object manipulation** in complex environments. Additionally, systems supporting **long-horizon planning** and **test-time training** are pushing **robotic autonomy** forward, especially in **dynamic 3D scenes**.
- **Facial & Human Avatar Synthesis:** Progress yields **natural virtual avatars** suitable for **VR**, **gaming**, and **cinema**, fostering **more humanlike interactions** and **emotional engagement**.
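The flow-matching component mentioned above trains a velocity network on interpolated states. A minimal sketch of the standard conditional flow-matching training target with a linear interpolant (the architecture details such as causal transformers are out of scope, and this is the generic recipe, not any specific system's):

```python
import numpy as np

def flow_matching_target(x0, x1, t):
    """Linear-interpolant flow matching: returns the pair (x_t, v) that a
    velocity network v_theta(x_t, t) would regress on.

    x0: sample from the source (e.g. noise), x1: data sample, t in [0, 1].
    """
    x_t = (1 - t) * x0 + t * x1
    v = x1 - x0  # constant velocity along the straight-line path
    return x_t, v
```

Training minimizes the squared error between the network's predicted velocity at `(x_t, t)` and `v`; sampling then integrates the learned velocity field from noise to data.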
**Implication:** These advances **redefine virtual presence**, **entertainment**, and **robotic interaction**, making **lifelike, real-time experiences** increasingly accessible and immersive.
---
## System-Level Engineering and Democratization of AI
To **lower barriers** and **accelerate deployment**, system-level innovations have become central:
- **Self-Tuning Runtimes:** Platforms like **VibeTensor** dynamically **optimize latency and throughput**, ensuring **robust performance** across diverse hardware.
- **Edge Inference & Compression:** Frameworks such as **Nanoquant** and **HySparse KV caches** enable **efficient on-device inference**, supporting **autonomous vehicles**, **wearables**, and **smart sensors**.
- **Training-Free Scene Editing:** Tools like **OmnimatteZero** allow **real-time object removal**, **reflection editing**, and **scene modifications** even on **consumer hardware**, democratizing **creative editing**.
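The compression that makes on-device inference feasible typically starts with weight quantization. A minimal sketch of symmetric per-tensor int8 quantization (a standard baseline, not the specific scheme of any framework named above; assumes the weight tensor is not all zeros):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization.

    Returns the quantized weights and the scale needed to dequantize;
    reconstruction error is bounded by half the scale.
    """
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 weights back to float32 for (or during) inference."""
    return q.astype(np.float32) * scale
```

Storing `q` plus one float per tensor cuts memory roughly 4x versus float32; production pipelines refine this with per-channel scales and activation calibration.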
**Outcome:** These system innovations **democratize AI access**, **speed up industry adoption**, and support privacy-preserving, **on-device inference**.
---
## Embodied AI, Long-Horizon Autonomy, and Security Concerns
The focus on **robust embodied AI agents** persists:
- **Physics-Informed Models & Structured Memory:** Agents built on these components sustain **long-term autonomy**, **complex object manipulation**, and **multi-hour task execution** in dynamic environments.
- **Multi-Robot Coordination:** Robots now handle **maintenance**, **monitoring**, and **construction**, demonstrating **scalability** and **reliability** at industrial scales.
- **Uncertainty Quantification:** Frameworks like **GADM** provide **confidence estimates** and **error detection**, crucial for **safe deployment** in **healthcare**, **transportation**, and **critical infrastructure**.
However, societal concerns about **security** and **privacy** have intensified:
- **Model Update & Fingerprinting Risks:** Empirical studies reveal that **model edits** and **updates** can **leak sensitive information** via **fingerprints**, raising serious **privacy alarms**.
- **Secure Protocols & Auditing:** Efforts are underway to develop **robust update protocols**, **attack detection mechanisms**, and **privacy-preserving training** methods to mitigate malicious exploitation.
Recent research such as *"GADM: Granularity-Aware Diffusion Model for Uncertainty Forecasting"* exemplifies **integrating uncertainty estimation** directly into models, fostering **trustworthiness** in high-stakes applications.
---
## Cross-Disciplinary Applications and Emerging Frontiers
Cross-disciplinary insights continue to invigorate the field:
- **Transport-Based Generative Models:** These models **preserve structural integrity** during transformations and, combined with **latent diffusion frameworks**, improve **controllability** and accelerate **training convergence**.
- **Generative Protein Design:** Cutting-edge work in **scaling diffusion models** for **protein engineering** enables **rapid, high-fidelity** creation of **functional proteins**, with profound implications for **drug discovery** and **synthetic biology**.
- **Data Engineering for Scaling LLMs:** Work such as *"On Data Engineering for Scaling LLM Capabilities"* emphasizes **efficient data curation**, **training pipelines**, and **scalable infrastructure**, all essential for maximizing model performance.
---
## Recent Recipes, Benchmarks, and Emerging Paradigms
The field continues to develop **practical guides** and **benchmarks** to accelerate innovation:
- **VLANeXt:** Provides **comprehensive recipes** for building **robust vision-language-action (VLA)** models, supporting **multimodal coherence**.
- **Rolling Sink:** Facilitates **long-horizon autoregressive video diffusion** via **test-time optimization**, advancing **sequential reasoning** in video synthesis.
- **Big Video Reasoning Benchmarks:** New datasets and evaluation protocols are emerging to **measure** and **drive progress** in **video understanding**.
- **Test-Time Training for 3D Reconstruction:** Techniques like **tttLRM** enable **dynamic scene understanding** and **long-horizon reasoning** in complex 3D environments.
- **Token-Based Zero-Shot Rewards:** Support **reward-based robotic learning** **without retraining**, fostering **flexible automation**.
- **Ψ-Samplers:** Sampling curricula that **reduce variance** and **accelerate convergence** in **diffusion sampling**.
- **Physically Based Rendering & Diffusion:** Efforts aim to **bridge physically based rendering pipelines** with **diffusion models**, enabling **more accurate** and **controllable visual synthesis**.
---
## Current Status and Societal Implications
By 2026, the AI landscape is characterized by a **fusion of deep theoretical understanding**, **engineering ingenuity**, and **broad accessibility**:
- Stronger **theoretical foundations** underpin **robust**, **trustworthy** content generation.
- **Multimodal, real-time, high-fidelity synthesis** across **visual**, **audio**, and **linguistic** domains has become **routine**.
- **Training-free, guidance-driven architectures** empower **interactive**, **controllable**, and **personalized** content creation.
- **Embodied AI systems** demonstrate **long-term autonomy**, **perception**, and **manipulation**, profoundly affecting **robotics**, **virtual agents**, and **autonomous vehicles**.
**Challenges** remain around **efficiency**, **interpretability**, **privacy**, and **security**. However, the **synergy** of **cross-disciplinary research**, **system engineering**, and **ethical safeguards** positions AI to **more effectively serve society**.
In essence, 2026 heralds not just the consolidation of core principles but the **dawn of a new paradigm**—where **creativity**, **autonomy**, and **trust** in AI **coalesce** to **reshape science, industry, and daily life**. The AI systems of today are **more powerful**, **more accessible**, and **more aligned with human values**, paving the way toward a future where machines **assist**, **amplify**, and **collaborate** with humanity at every level.