AI Paper Tracker

May 7, 2026

AI Paper Tracker · May 7 Daily Digest

New Agent Benchmarks

🔥 DataClaw: DataClaw employs a three-stage human-in-the-loop annotation pipeline of expert task design, human-AI...

May 6, 2026

Surge in Specialized Benchmarks for AI Agents

Domain-specific benchmarks proliferate for complex agent tasks:

Workspace-Bench 1.0: AI agents on workspace learning with large-scale file...

[2605.03596] Workspace-Bench 1.0: Benchmarking AI Agents on ... - arXiv

May 6, 2026 · arxiv.org

May 6, 2026

Equivariance Matters More at Scale in Neural Force Fields

Challenging the growing view that symmetry constraints matter less at scale, this empirical study of neural force fields reports that equivariance matters even more as models scale.

Scaling Laws and Symmetry, Evidence from Neural Force Fields

May 6, 2026 · arxiv.org

May 6, 2026

Trend: Efficiency Boosts in Flow and Diffusion Generative Models

Efficiency gains are a recurring theme in flow-based and stochastic generative models:

Self-Flow (ICML-2026): Self-supervised flow matching integrates...

May 6, 2026

Overlooked Agentic AI Risks: Red Teaming, Deskilling, and Unverified Skills

Agentic AI vulnerabilities are expanding rapidly:

Red teaming must accelerate from weeks to hours as AI enters healthcare, finance, and defense amid...

May 6, 2026

Pre-Alignment via Black-Box On-Policy Distillation for Multimodal RL

Beyond SFT-to-RL: This paper introduces pre-alignment via black-box on-policy distillation to bootstrap multimodal RL policies.

arxiv.org

Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL

May 6, 2026

LLMs as Universal Graph Predictors: New Systematic Evaluation

A new arXiv paper systematically evaluates how well LLMs understand graph tokens. The success of LLMs has motivated adapting them as universal predictors for graph tasks, which makes this evaluation relevant to work that combines GNNs and LLMs.

A Systematic Evaluation of Graph Token Understanding

May 6, 2026 · arxiv.org

May 6, 2026

Iterated Learning Mirrors Child Language Acquisition for Compositional AI

Compositionality and systematicity emerge from iterated learning, mirroring how children rapidly acquire language early in development. The work argues that language evolves through iterative processes, offering a path toward compositional AI.

Compositionality and systematicity emerge from iterated learning in ...

May 6, 2026 · pnas.org

May 6, 2026

SplAttN: New 2D-3D Bridge for Point Cloud Completion

SplAttN uses Gaussian soft splatting and attention to bridge 2D and 3D modalities for point cloud completion. It is a new arXiv upload aimed at more efficient reconstruction.

SplAttN: Bridging 2D and 3D with Gaussian Soft Splatting and Attention for Point Cloud Completion

arxiv.org


May 6, 2026

APEX: First Large-Scale MTL for AI-Generated Music Popularity

APEX is the first large-scale multi-task learning framework for AI-generated music, trained on over 211k songs (10k hours of audio) from Suno and Udio to predict popularity with aesthetic insights.

[2605.03395] APEX: Large-scale Multi-task Aesthetic-Informed Popularity ...

May 6, 2026 · arxiv.org

May 6, 2026

HeteroSense-FL: Python Toolkit for Multimodal Heterogeneous FL Simulation

HeteroSense-FL is a new Python software package for structured multimodal sensor simulation, targeting research on modality-heterogeneous federated learning (FL). It aims to fill simulation gaps in heterogeneous FL studies.

HeteroSense-FL: A multimodal simulation testbed for modality ...

May 6, 2026 · sciencedirect.com

May 6, 2026

DocETL: Declarative Agentic Fix for LLM Document Pipelines

DocETL optimizes complex document processing pipelines by addressing LLM shortcomings through a declarative interface for agentic query rewriting and evaluation.

DocETL: Agentic Query Rewriting and Evaluation for Complex Document ...

May 6, 2026 · dl.acm.org

May 6, 2026

AI Paper Tracker · May 6 Daily Digest

NeurIPS 2026 Reproducibility Track

🔥 MLRC 2026: MLRC 2026 accepted reproducibility papers will be presented in person at NeurIPS 2026 in...

May 5, 2026

Diffusion Priors vs. Motion Caching: Boosting Scalable Video Synthesis

UniVidX introduces a unified multimodal framework via diffusion priors for versatile video generation
Motion-Aware Caching enhances efficiency...

May 5, 2026

Compound AI Agent Unifies Grant Discovery

Fresh arXiv paper introduces a compound AI system for conversational grant discovery:

Aggregation layer autonomously collects data to unify the...

[2605.02366] A Compound AI Agent for Conversational Grant Discovery

May 5, 2026 · arxiv.org

May 5, 2026

AI Agency Parallels Human Frontal Lobe Development

New arXiv paper [2605.02810] compares human agency, which takes many years to develop as the frontal lobe activates, with potential agency in AI programs. A philosophical lens on AI's developmental path.

[2605.02810] AIs and Humans with Agency

May 5, 2026 · arxiv.org

May 5, 2026

LLMs Introduce Technical Debt in AI-Generated Code

New paper audits AI-generated software flaws:

Title: AI-Generated Smells: An Analysis of Code and Architecture in LLM
Core insight: Systematic...

AI-Generated Smells: An Analysis of Code and Architecture in LLM and ...

May 5, 2026 · arxiv.org

May 5, 2026

T^2PO: New Uncertainty-Guided Method for Stable Multi-Turn Agentic RL

T^2PO introduces uncertainty-guided exploration control to stabilize multi-turn agentic reinforcement learning. It is a new arXiv upload that targets key stability challenges in agentic RL.