Reasoning depth, retrieval-aware inference, and uncertainty-aware decoding [developing]
Key Questions
What is Cog-DRIFT?
Cog-DRIFT addresses stalls in reinforcement learning with verifiable rewards (RLVR) by combining scaffolding with learning from zero-reward rollouts, helping models break through exploration barriers on hard reasoning tasks.
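A minimal sketch of the general idea, not Cog-DRIFT's published method: when the verifiable end-reward is always zero (the stall), a scaffold can supply dense intermediate credit so exploration still receives gradient signal. All names below are hypothetical.

```python
# Illustrative scaffolded reward shaping for RLVR (not Cog-DRIFT's actual method).
# Idea: a rollout that never earns the binary verifiable reward still gets
# partial credit for reaching scaffold checkpoints, so learning does not stall.

def scaffolded_reward(solution_steps, checkpoints, verifier, bonus=0.1):
    """Combine a binary verifiable reward with scaffold checkpoint credit.

    solution_steps: list[str]  -- the model's intermediate reasoning steps
    checkpoints:    list[str]  -- known-good intermediate results (the scaffold)
    verifier:       callable   -- returns True iff the final answer checks out
    """
    # Full verifiable reward: 1.0 only if the final answer verifies.
    final = 1.0 if verifier(solution_steps[-1]) else 0.0
    # Zero-reward rollouts still earn dense credit for matched checkpoints,
    # which keeps the policy gradient from vanishing during the stall.
    matched = sum(1 for c in checkpoints if any(c in s for s in solution_steps))
    return final + bonus * matched / max(len(checkpoints), 1)

# Example: a rollout that misses the final answer but hits one scaffold
# checkpoint receives a small positive reward instead of exactly zero.
steps = ["let x = 2y", "substitute: 3(2y) + y = 14", "y = 3"]
print(scaffolded_reward(steps, checkpoints=["x = 2y", "y = 2"],
                        verifier=lambda ans: ans == "y = 2"))  # 0.05
```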
What do ARC-AGI-3 and ViGoR-Bench evaluate?
ARC-AGI-3 and ViGoR-Bench test reasoning depth in visual and general-purpose models, and both expose limits in models' ability to discover latent planning structure.
What is the Depth Ceiling in LLMs?
The Depth Ceiling describes LLMs' limited ability to discover latent planning structures on their own. Stanford/MIT studies report that agent harnesses boost performance 6x over bare model calls.
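A minimal sketch of what a harness adds on top of a bare model call, assuming hypothetical `call_model` and `run_check` stand-ins; published harnesses are more elaborate, but the propose-verify-retry loop is the core pattern.

```python
# Illustrative contrast between a bare call and a harnessed call
# (not the Stanford/MIT harness itself; all names are stand-ins).

def bare_call(call_model, task):
    """Baseline: one shot, no feedback."""
    return call_model(f"Solve: {task}")

def harnessed_call(call_model, run_check, task, max_rounds=6):
    """Harness: propose, verify externally, and retry with feedback --
    the outer loop that harness studies credit for large gains."""
    feedback = ""
    for _ in range(max_rounds):
        attempt = call_model(f"Solve: {task}\n{feedback}")
        ok, report = run_check(attempt)  # external verifier: tests, checker, etc.
        if ok:
            return attempt
        feedback = f"Previous attempt failed: {report}. Revise."
    return attempt
```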
What patterns appear in tool-integrated reasoning?
Studies of tool-integrated reasoning surface inefficiency patterns that accuracy metrics alone miss, and they decompose where the gains in multi-LLM pipelines actually come from.
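As an illustration of what such analysis can look like, here is a hedged sketch of trace-level inefficiency metrics; the metric definitions are assumptions, not the cited studies' exact formulations.

```python
# Illustrative inefficiency metrics over tool-call traces (assumed definitions).
from collections import Counter

def inefficiency_report(trace):
    """trace: list of (tool_name, arguments) tuples from one reasoning episode."""
    calls = Counter(trace)
    total = len(trace)
    # Redundant calls: identical tool invocations repeated within one episode.
    redundant = sum(n - 1 for n in calls.values() if n > 1)
    return {
        "total_calls": total,
        "redundant_calls": redundant,
        "redundancy_rate": redundant / total if total else 0.0,
    }

trace = [("search", "graph coloring bound"), ("calc", "3*7"),
         ("search", "graph coloring bound")]  # the repeat is pure waste
print(inefficiency_report(trace))
# {'total_calls': 3, 'redundant_calls': 1, 'redundancy_rate': 0.333...}
```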
What is the 70-page RM on mathematical objects?
Weston's 70-page paper covers reasoning over mathematical objects and advances benchmarks for symbolic and structured reasoning.
How does context affect LLM reasoning?
Per the Reasoning Shift work, added context can silently shorten an LLM's reasoning depth, with knock-on effects on chain-of-thought faithfulness (87%/29%).
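A rough sketch of how the effect could be measured, assuming a hypothetical `generate` stand-in for a model call and a crude step-count proxy for reasoning depth:

```python
# Illustrative measurement harness for "context shortens reasoning"
# (all names are hypothetical; depth proxy is deliberately crude).

def reasoning_depth(chain: str) -> int:
    """Proxy for depth: count explicit step markers in the chain of thought."""
    return sum(1 for line in chain.splitlines() if line.strip().startswith("Step"))

def depth_vs_context(generate, question, fillers):
    """Ask the same question under growing amounts of irrelevant context
    and record how the generated chain's depth changes."""
    results = []
    for filler in fillers:  # e.g., "", then ever-longer irrelevant documents
        chain = generate(filler + "\n" + question)
        results.append((len(filler), reasoning_depth(chain)))
    return results          # depth falling as context grows is the effect
```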
What are FIPO and LightThinker++?
FIPO and LightThinker++ target uncertainty-aware decoding and retrieval-aware inference, improving inference through scaffolding and symbolic aids.
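As a concrete if simplified picture of uncertainty-aware decoding, here is a generic entropy-gated decoding step; this is an assumption-laden sketch, not FIPO's or LightThinker++'s actual algorithm.

```python
# Illustrative entropy-gated decoding (generic technique, not a specific paper's).
import numpy as np

def token_entropy(logits):
    """Shannon entropy (nats) of the next-token distribution."""
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return float(-(p * np.log(p + 1e-12)).sum())

def decode_step(logits, entropy_threshold=2.5):
    """Greedy step that flags high-uncertainty positions; a real system would
    respond by retrieving evidence or spending more deliberation there."""
    if token_entropy(logits) > entropy_threshold:
        return None, "uncertain"           # defer: trigger retrieval / rethink
    return int(np.argmax(logits)), "confident"

rng = np.random.default_rng(0)
print(decode_step(rng.normal(size=50)))        # near-uniform logits -> uncertain
print(decode_step(np.array([8.0, 0.0, 0.0])))  # peaked logits -> confident
```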
Why emphasize symbolic approaches like those of Chollet and Marcus?
Chollet and Marcus advocate symbolic reasoning over pure curve-fitting, as a counter to the limits of scaling alone for deep reasoning.
Topics: Stanford/MIT agent-harness surveys (6x gains); ARC-AGI-3 / ViGoR-Bench; Cog-DRIFT RLVR stall fix via scaffolding; Erdős; Weston 70-page RM; CoT faithfulness (87%/29%); FIPO / Qworld; LightThinker++; Chollet/Marcus symbolic reasoning; tool-integrated inefficiency patterns.