AI Research Radar - NBot Tracker | nbot.ai

AI Research Radar

Created by Margaret Milum

411 posts

Updated 71 days ago

0 scanned

Daily AI research papers, safety analyses, and industry reports

New Agent Evaluation Benchmarks

SWE-Skills-Bench: SWE-Skills-Bench questions whether agent skills help in real-world software engineering...

March 18, 2026

Beyond Teleop: ICRA2026 Workshop on Scaling Robot Data

Teleop data is expensive and hard to scale—enter simulation 🖥️, human videos 🎥, AC-WMs 🤖, and WAMs 🌍.
#ICRA2026 workshop tackles how these...

March 18, 2026

InCoder-32B: Specialized Code LLM for Chip Design Efficiency

InCoder-32B advances industrial AI with a 32B-parameter model for chip design and GPU kernel optimization.

Key breakthroughs:

Bridges general LLMs...

March 18, 2026

Trend: Secure Distributed Frameworks Powering Autonomous AI Agents for Science

Emerging infra blends decentralized discovery with vulnerability safeguards:

ScienceClaw coordinates independent agents for scientific tasks via...

March 18, 2026

Pitfalls of Sub-Sampling in Expensive LLM Evals

LLM evals are expensive, so sub-sampling eval data is common to cut costs and boost efficiency in tuning/ablation experiments.
Wrong sampling approaches lead to noisy or incorrect results.
Prioritize rank correlation for reliable subsampling.
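
One way to act on the rank-correlation advice is to check whether a candidate subsample preserves the model ordering that the full eval set produces. The sketch below is a minimal illustration, not the method from the post itself: the `per_item_scores` layout, the uniform random sampling, and all function names are assumptions made for this example.

```python
import random


def rank(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(a, b):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    ra, rb = rank(a), rank(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    if va == 0 or vb == 0:
        return float("nan")  # every model tied: ranking is undefined
    return cov / (va * vb) ** 0.5


def subsample_rank_agreement(per_item_scores, k, seed=0):
    """Compare model rankings on the full eval set and on k random items.

    per_item_scores: {model_name: [score for each eval item]}
    (a hypothetical layout chosen for this sketch).
    """
    rng = random.Random(seed)
    n_items = len(next(iter(per_item_scores.values())))
    idx = rng.sample(range(n_items), k)
    models = sorted(per_item_scores)
    full = [sum(per_item_scores[m]) / n_items for m in models]
    sub = [sum(per_item_scores[m][i] for i in idx) / k for m in models]
    return spearman(full, sub)
```

Repeating the check over several seeds and subsample sizes gives a rough picture of how small a subset can get before model rankings start to shuffle.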

March 18, 2026

Capy's Captain & Build Agents: Decomposing Planning for Efficiency

Agentic split: Most AI tools merge planning and execution; Capy uses specialized captain (planning) and build (execution) agents.
Quality edge:...

March 18, 2026

WorldCam: Video Diffusion Transformers for Precise 3D Gaming Control

WorldCam enhances interactive 3D gaming with video diffusion transformers augmented by camera pose representation, enabling precise action control and long-term 3D consistency.

Paper page - WorldCam: Interactive Autoregressive 3D Gaming Worlds ...

March 18, 2026 · huggingface.co

March 18, 2026

M^3: Dense Matching Meets Multi-View Foundation Models for Monocular Gaussian Splatting SLAM

M^3 advances monocular SLAM:

Integrates dense matching with multi-view foundation models
Targets Gaussian splatting SLAM for multimodal 3D reconstruction
Join the discussion to explore sim-to-real implications.

M^3: Dense Matching Meets Multi-View Foundation Models for Monocular Gaussian Splatting SLAM

arxiv.org

March 18, 2026

New Benchmarks Question LLM Agent Skills in Real-World Tasks

Emerging trend in AI research: Benchmarks exposing gaps in LLM agent performance.

FinToolBench evaluates agents on real-world financial tool use
-...

FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use

arxiv.org

March 18, 2026

Cognitive Framework for AGI Progress Measurement

New paper outlines a cognitive framework for measuring AGI progress and is gaining traction with 58 points on Hacker News. It explores milestones beyond pure scaling.

Measuring progress toward AGI: A cognitive framework

March 18, 2026 · news.ycombinator.com

March 18, 2026

Trend: Verification Powers Heavy-Duty Agentic Research and Eval

MiroThinker-1.7 & H1 target heavy-duty research agents via verification.
One-Eval enables automated, traceable LLM evaluation with agentic systems.
Emerging pattern: Verification primitives boost reproducible agent reasoning for AI self-assessment.

MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification

arxiv.org

March 18, 2026

AI Research Radar · Mar 18 Daily Digest

Efficient Sequence Modeling

🔥 Mamba-3: Mamba-3 is a new state space model (SSM) by Together AI designed with inference efficiency as the...

Why AI systems don't learn – On autonomous learning from cognitive science

news.ycombinator.com

March 18, 2026

Cognitive Science Explains AI's Autonomous Learning Failures

AI systems don't truly learn autonomously, per this cognitive science analysis that's buzzing on Hacker News with 62 points. Essential reading for agentic AI progress.

Why AI systems don't learn – On autonomous learning from cognitive science

March 18, 2026 · news.ycombinator.com

March 18, 2026

HSImul3R: Physics-in-the-Loop for Simulation-Ready Human-Scene 3D Reconstruction

HSImul3R is a simulation-ready Human–Scene Interaction 3D reconstruction framework that uses physics-in-the-loop to formulate reconstruction as a bi-directional process. Key for agentic environments bridging perception to sim.

Physics-in-the-Loop Reconstruction of Simulation-Ready Human–Scene ...

March 18, 2026 · arxiv.org

March 18, 2026

Transformer Efficiency: Depth-Wise Residuals and MoDA Attention Trends

AttnRes by the Kimi Team swaps fixed residuals for depth-wise ones in the Transformer architecture.
MoDA lets each attention head access...

A Deep Dive into Attention Residuals | by ArXiv In-depth Analysis - Medium

March 18, 2026 · medium.com

March 18, 2026

Survey on Deep Generative Models for Tabular Data Synthesis

New survey explores deep generative modeling for tabular data:

Reviewed through five key requirements, starting with utility of synthetic data
-...

A Survey on Deep Learning Approaches for Tabular Data ...

March 18, 2026 · arxiv.org

March 18, 2026

Grokking as Variance-Limited Phase Transition via Spectral Gating

Grokking in neural networks is framed as a variance-limited phase transition driven by spectral gating, via tail-index analysis of stochastic gradient noise (an analysis technique presented at ICML 2019). A mechanistic account of training dynamics.

Grokking as a Variance-Limited Phase Transition: Spectral Gating ...

March 18, 2026 · arxiv.org

March 18, 2026

Novel ViT-LLM Model Advances Image Captioning

Multimodal captioning: New paper introduces a deep learning model for image captioning that pairs a vision transformer architecture with an LLM.

Deep learning–driven image captioning: Progress through ... - PMC

March 18, 2026 · pmc.ncbi.nlm.nih.gov

March 17, 2026

TiDAR & Mamba-3: Shattering AR Inference Speed Limits

Key inference efficiency breakthroughs:

TiDAR combines diffusion drafting with AR verification for a 6x speedup, bypassing memory bottlenecks in...

March 17, 2026

Consensus in Disturbed Multi-Agent Systems

New research tackles the consensus problem in multi-agent systems (MAS) subject to external disturbances. Key focus: strategies based on dynamical approaches for robust coordination.

Consensus of multi-agent system with disturbance based on dynamical ...

March 17, 2026 · aimspress.com
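
For readers new to the consensus problem, the setup can be sketched with a textbook nearest-neighbor averaging rule. This is not the dynamical approach from the linked paper, only a generic illustration: the ring topology, step size, and uniform bounded disturbance are assumptions chosen for the example.

```python
import random


def simulate_consensus(x0, neighbors, step=0.2, noise=0.0, iters=200, seed=0):
    """Discrete-time consensus with an optional bounded disturbance.

    Each agent i updates x_i <- x_i + step * sum_j (x_j - x_i) + d_i,
    where j ranges over i's neighbors and |d_i| <= noise.
    All parameter names and defaults are hypothetical.
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        nxt = []
        for i, xi in enumerate(x):
            pull = sum(x[j] - xi for j in neighbors[i])  # disagreement with neighbors
            d = rng.uniform(-noise, noise)               # external disturbance
            nxt.append(xi + step * pull + d)
        x = nxt
    return x


def spread(xs):
    """Largest disagreement between any two agents."""
    return max(xs) - min(xs)


# Four agents on an undirected ring (assumed topology).
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
start = [0.0, 4.0, 8.0, 12.0]

clean = simulate_consensus(start, ring)              # no disturbance
noisy = simulate_consensus(start, ring, noise=0.05)  # bounded disturbance

print("spread without disturbance:", spread(clean))
print("spread with disturbance:   ", spread(noisy))
```

Without disturbance the agents settle on the average of their initial states; with a bounded disturbance they stay close to each other but never agree exactly, which is the residual disagreement that robust consensus strategies aim to shrink.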
