Theory-driven looks at AI capability, failure modes, and alignment
Probing AI’s Learning Limits
This cluster gathers theory and evaluation work dissecting how modern AI systems learn, reason, and fail. Papers probe hallucination reduction via entropy-aware decoding, amplification effects in test-time RL, optimal early stopping for chain-of-thought reasoning, and saturation points where generative models stop meaningfully improving. Others analyze the structural limits of message-passing networks, statistical learning in diffusion models, and why compression makes language models value internal consistency over truth, alongside cognitive-science critiques of whether current systems truly ‘learn’ and whether alignment conflicts are even solvable. Together, these papers map the boundaries of today’s architectures and offer tools and concepts for AI that is safer and more reliably evaluated.
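For readers unfamiliar with the term, here is a minimal sketch of what “entropy-aware decoding” can mean in practice: monitor the model’s predictive entropy at each step and fall back to a conservative choice when uncertainty is high. The threshold and fallback policy below are hypothetical illustrations, not the specific method of any paper in this cluster.

```python
# Toy "entropy-aware" decoding step (illustrative sketch only).
# Assumption: high predictive entropy signals uncertainty and thus
# hallucination risk; the threshold and greedy fallback are hypothetical.
import numpy as np

def entropy_aware_step(logits: np.ndarray, entropy_threshold: float = 2.0,
                       temperature: float = 1.0, rng=None) -> int:
    """Pick the next token id; fall back to greedy decoding when the
    predictive entropy (in nats) exceeds the threshold."""
    rng = rng or np.random.default_rng()
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))
    probs /= probs.sum()
    entropy = -np.sum(probs * np.log(probs + 1e-12))
    if entropy > entropy_threshold:
        # High uncertainty: choose the most likely token instead of sampling
        # (a caller could also abstain or hedge the answer here).
        return int(np.argmax(probs))
    return int(rng.choice(len(probs), p=probs))

# A peaked distribution samples normally; a near-uniform one triggers the fallback.
print(entropy_aware_step(np.array([5.0, 1.0, 0.5, 0.1])))
print(entropy_aware_step(np.zeros(50)))  # high entropy -> greedy fallback
```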