
AI Research Pulse - NBot Tracker | nbot.ai

AI Research Pulse

Created by J. Parker Watkins Jr.

465 posts

Updated 69 days ago

0 scanned

Daily peer-reviewed AI papers spanning theory, applications, and safety


New LLM Agent Benchmarks

🔥 EnterpriseOps-Gym: EnterpriseOps-Gym provides environments and evaluations for stateful agentic planning and tool...

March 18, 2026

AI's Vast Language Gap: Infrastructure Barrier to Equitable Inclusion

Most AI systems fail 6,000+ of the world's 7,000+ languages, not just because of data scarcity but because of a lack of basic digital infrastructure. Stanford HAI's white...

March 18, 2026

Moonshot AI's Attention Residuals Revolutionize Long Contexts

Attention Residuals redesign residuals with softmax filtering, solving PreNorm dilution and enabling selective historical retrieval for superior...

March 18, 2026

Trend: Tailored Interfaces and Splits Elevate AI Coding Agents Amid Context Pitfalls

Key trend in AI coding agents: specialization drives reliability, but not all tweaks help.

Agent-Computer Interface (ACI): SWE-agent's compact...

March 18, 2026

Power-Aware Benchmarks: Vital for VLM Deployment Trade-offs

AI benchmarks for state-of-the-art workloads are critical to understanding performance-energy trade-offs in deploying vision-language models—essential for energy-efficient agentic apps.

Power-Aware Performance Analysis for Vision and Language Models

March 18, 2026 · arxiv.org

March 18, 2026

Multi-Agent Negotiation: Breakthrough for AI Moral Alignment

Emerging multi-agent approaches tackle alignment crises by having AIs debate moral dilemmas like confidentiality vs. justice.

Gradients through...

March 18, 2026

LLMs Emerge as Powerful Tools for Code Security

LLMs have emerged as powerful tools for automating programming tasks, including security-related ones, per this systematic literature review.

A Systematic Literature Review of LLMs in Code Security - arXiv

March 18, 2026 · arxiv.org

March 18, 2026

Wave of New Benchmarks for Reliable LLM Agents in Real-World Tasks

Amid scaling LLM agents, fresh benchmarks tackle tool use, processes, and enterprise planning:

FinToolBench evaluates real-world financial tool...

March 18, 2026

MiroThinker-1.7 & H1: Verification for Heavy-Duty Research Agents

MiroThinker-1.7 & H1 advance heavy-duty research agents via verification. Join the discussion on this promising paper.

MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification

arxiv.org

March 18, 2026

SocialOmni: Benchmark for Audio-Visual Social Interactivity in Omni Models

SocialOmni benchmarks audio-visual social interactivity in omni models, advancing evaluation of multimodal social capabilities for applied AI.

SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

arxiv.org

March 18, 2026

TRUST-SQL: Multi-Turn RL for Text-to-SQL on Unknown Schemas

TRUST-SQL pioneers tool-integrated multi-turn reinforcement learning to enable reliable text-to-SQL agents over unknown schemas, tackling database querying challenges.

TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas

arxiv.org

March 18, 2026

Cognitive Framework for Measuring AGI Progress

A new cognitive framework offers structured metrics for tracking progress toward AGI, sparking interest with 58 points on Hacker News.

Measuring progress toward AGI: A cognitive framework

March 18, 2026 · news.ycombinator.com

March 18, 2026

AI Research Pulse · Mar 18, 2026 Daily Digest

LLM Depth and Efficiency Advances

🔥 Attention Residuals: Proposes Attention Residuals (AttnRes) to selectively combine previous layer outputs...

March 18, 2026

HSImul3R: Physics-in-the-Loop 3D Reconstruction for Simulation-Ready Human-Scene Interactions

HSImul3R introduces physics-in-the-loop to formulate simulation-ready Human–Scene Interaction 3D reconstruction, closing a key gap in realistic sim environments for embodied AI.

Physics-in-the-Loop Reconstruction of Simulation-Ready Human–Scene ...

March 18, 2026 · arxiv.org

March 18, 2026

LLM Valuable Capabilities = Black Box Safety Risks

Core thesis: LLMs' most valuable capabilities are precisely those posing interpretability challenges and safety risks.
Key implications: Reshapes...

Why the Valuable Capabilities of LLMs Are Precisely the ... - arXiv

March 18, 2026 · arxiv.org

March 18, 2026

Hybrid RL Meets Iterative Learning for Batch Process Control

Iterative Learning Control informs Reinforcement Learning to advance batch process control, as detailed in arXiv paper 2603.15180. This hybrid approach promises gains in industrial applied AI systems.

Iterative Learning Control-Informed Reinforcement Learning for Batch ...

March 18, 2026 · arxiv.org

March 18, 2026

Why AI Lacks True Autonomous Learning: Cognitive Science View

AI systems don't truly learn autonomously, according to an argument from cognitive science that drew 62 points on Hacker News and debate about rethinking AI limitations.

Why AI systems don't learn – On autonomous learning from cognitive science

March 18, 2026 · news.ycombinator.com

March 18, 2026

Bayesian Teaching Unlocks Probabilistic Reasoning in LLMs

New training framework enables LLMs to update beliefs and infer user preferences via probabilistic inference.

Key highlights:

Bayesian Teaching...

March 18, 2026

Sparsity Regulates Variance to Unlock Deeper LLMs

Sparsity goes beyond efficiency in LLMs: it acts as a regulator of variance propagation, improving depth utilization and mitigating the curse of depth.

When Does Sparsity Mitigate the Curse of Depth in LLMs - arXiv.org

March 18, 2026 · arxiv.org

March 17, 2026

Grounding World Sims in Real Metropolises

New paper on grounding world simulation models in a real-world metropolis – real-world validation to boost simulation fidelity for agent training and planning in urban environments.
