# AI Advances of 2024: Toward Trustworthy, Interpretable, and Multimodal Autonomous Systems
The year 2024 marks a turning point in artificial intelligence, where the focus shifts from merely enhancing raw performance to ensuring **trustworthiness, interpretability, and reasoning capabilities**. Building upon foundational efforts in mechanistic attribution, latent reasoning, and multimodal data integration, recent innovations are **redefining how models trace data origins, compress complex reasoning pathways, and seamlessly handle multiple sensory modalities**. These breakthroughs are not only advancing research but also **transforming practical applications across high-stakes domains**—emphasizing **explainability, societal alignment, and efficiency** alongside raw capability.
---
## Reinforcing Data Provenance, Reproducibility, and Trustworthiness
A major challenge in deploying large-scale models—especially in sensitive areas such as **healthcare, legal systems, scientific research, and autonomous operations**—remains **understanding how training data influences model decisions**. Recent developments have introduced **powerful tools and frameworks** to **automate data curation, standardize benchmarking, and promote transparency**:
- **DataChef**: This innovative framework employs **reinforcement learning** to generate **optimized data recipes** tailored for fine-tuning LLMs. The recent release of **DataChef-32B** demonstrates how **bias-aware, task-specific datasets** can be constructed, **enhancing model trustworthiness** and **performance consistency**. Such curated data **serve as the backbone of transparent AI development**.
- **The AI Replication Engine**: A dedicated infrastructure for **systematic benchmarking and experiment reproducibility**, this tool **enables researchers** to **verify findings, identify biases, and reproduce results efficiently**. Its widespread adoption **strengthens accountability** and **supports scientific integrity**, especially crucial for **high-stakes deployment**.
- **Standardized Data Recipes and Curation Protocols**: The community now emphasizes **best practices** in data collection and preprocessing—**building models on transparent, controllable, and well-understood datasets**. These protocols are vital for **regulatory compliance** and **scalable, responsible deployment**.
- **Hierarchy-Aware Multimodal Unlearning**: Extending privacy protections, **hierarchy-aware unlearning frameworks**—particularly **HIPAA-aligned unlearning**—allow models to **forget sensitive or outdated information** while **maintaining overall performance**. This approach **strengthens privacy-preserving AI** and **supports regulatory adherence**, especially in **healthcare, legal, and personal data applications**.
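The unlearning idea above can be illustrated in miniature. The hierarchy-aware, HIPAA-aligned method itself is not specified here, so the sketch below uses the simplest common baseline instead: gradient *descent* on retained data interleaved with gradient *ascent* on the forget set, applied to a toy logistic model (all names and hyperparameters are illustrative assumptions, not the framework's API):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad(w, x, y):
    """Gradient of the logistic loss for a single (x, y) example."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return [(p - y) * xi for xi in x]

def train(w, data, lr=0.5, steps=50):
    """Plain gradient descent on all examples."""
    w = list(w)
    for _ in range(steps):
        for x, y in data:
            g = grad(w, x, y)
            w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

def unlearn(w, retain, forget, lr=0.5, steps=200):
    """Descend on the retain set while ascending on the forget set, so
    the model keeps overall performance but loses the forgotten data.
    This is the generic baseline, not the hierarchy-aware method."""
    w = list(w)
    for _ in range(steps):
        for x, y in retain:
            g = grad(w, x, y)
            w = [wi - lr * gi for wi, gi in zip(w, g)]
        for x, y in forget:
            g = grad(w, x, y)
            w = [wi + lr * gi for wi, gi in zip(w, g)]
    return w
```

After unlearning, confidence on the forget examples collapses while behavior on the retain set is preserved, which is the qualitative effect the frameworks above aim for at scale.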
**Collectively**, these tools **embed transparency and control into the entire training pipeline**, fostering **trustworthy AI systems** grounded in **well-understood, controllable data foundations**—a **crucial step toward societal acceptance and ethical deployment**.
---
## Deciphering Internal Dynamics: Insights into Model Reasoning
Understanding **how models process, prioritize, and utilize their internal representations** remains a **central pillar of AI interpretability**. Recent advances are **peeling back the layers of internal mechanisms** to **provide granular insights**:
- **Attention Decoding with Contrastive Covariance**: Researchers have demonstrated that **contrastive covariance techniques** can **visualize attention flows** within LLMs, revealing **which tokens or internal features influence decisions**. @_akhaliq emphasizes that this method **offers detailed insights into reasoning pathways**, helping **demystify complex decision-making** and **identify biases or failure modes**.
- **Gated Recurrent Memory (GRU-Mem)**: This architecture **dynamically manages memory** through **text-controlled gating mechanisms**, enabling models to **retain relevant information over long contexts**. Such capacity is **crucial for extended reasoning**, multi-step problem solving, and **long-horizon decision-making**.
- **Activation Steering Adapters (ASA)**: A **training-free technique**, ASA **manipulates internal activations** to **correct and steer language models’ tool-calling behaviors**. This **enables models** to **perform external function calls** with **greater robustness and flexibility**, **without additional retraining**, streamlining **tool integration**.
- **Autograding Frameworks for Multimodal Outputs**: New **automatic evaluation systems** assess models generating **complex, multimodal content**—such as images, text, or their combinations. These frameworks **support scalable, reliable evaluation** of **multimodal reasoning**, facilitating **benchmarking and iterative improvement**.
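ASA's exact mechanism is not spelled out above, but inference-time activation steering has a well-known minimal form: extract a steering vector as the difference of mean hidden activations between contrasting prompt sets, then add it to the hidden state during the forward pass, with no retraining. A toy sketch under that assumption (the two-layer `TinyNet` and every name here are illustrative, not ASA's API):

```python
import math

def tanh_vec(v):
    return [math.tanh(x) for x in v]

class TinyNet:
    """Two-layer toy network standing in for one transformer layer."""
    def __init__(self, W1, W2):
        self.W1, self.W2 = W1, W2

    def hidden(self, x):
        return tanh_vec([sum(w * xi for w, xi in zip(row, x))
                         for row in self.W1])

    def forward(self, x, steer=None, alpha=1.0):
        h = self.hidden(x)
        if steer is not None:
            # Inference-time intervention: add the scaled steering
            # vector to the hidden activations; no weights change.
            h = [hi + alpha * si for hi, si in zip(h, steer)]
        return sum(w * hi for w, hi in zip(self.W2, h))

def steering_vector(net, pos_inputs, neg_inputs):
    """Difference of mean hidden activations between inputs that show
    the desired behavior and inputs that do not (the classic
    activation-addition recipe; an assumed stand-in for ASA)."""
    def mean_hidden(xs):
        hs = [net.hidden(x) for x in xs]
        return [sum(col) / len(hs) for col in zip(*hs)]
    mp, mn = mean_hidden(pos_inputs), mean_hidden(neg_inputs)
    return [a - b for a, b in zip(mp, mn)]
```

Because the vector is computed from forward passes only, the whole correction is training-free, which is the property the bullet above highlights.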
**These methods** **demystify internal reasoning**, making models **more transparent, robust, and trustworthy**, especially in **multi-input, multimodal scenarios**.
---
## Visualizing and Aligning Internal Representations: Multimodal Interpretability
Progress in **visualizing internal representations** and **aligning sensory modalities** is **revolutionizing interpretability** and **cross-modal understanding**:
- **LatentVision Tools (LatentChem and LatentLens)**: These visualization platforms **reveal internal visual tokens** within language models. @_akhaliq highlights **LatentLens**’s ability to **illuminate interpretable visual tokens**, assisting in **scientific discovery**, **medical diagnostics**, and **reproducibility** by **shedding light on internal processes**.
- **Contrastive Multimodal Learning for Medical Imaging**: Employing **contrastive learning**, recent techniques **align visual and textual representations**, leading to **improved interpretability and diagnostic accuracy**—a breakthrough for **medical AI applications**.
- **OneVision-Encoder**: This architecture **utilizes codec-aligned sparsity** to **align visual and language representations**, fostering **more robust cross-modal understanding**—crucial for **multimedia retrieval**, **robot perception**, and **assistive AI**.
- **DreamDojo**: A **generalist robot world model**, trained on **large-scale human videos**, supporting **multi-task robotic manipulation** and **environment understanding**. Its capacity to **integrate visual, textual, and action data** exemplifies **embodied multimodal modeling**, marking significant progress in **autonomous systems**.
- **A Very Big Video Reasoning Suite**: This **extensive framework** enables models to **reason over extended videos**, supporting **temporal understanding**, **multimodal integration**, and **long-horizon inference**—a vital step toward **holistic perception**.
- **Evaluation Benchmarks**: Standards such as **visual mathematics tests** and **CodeOCR** provide **rigorous evaluation frameworks** for **visual reasoning** and **visual code understanding**, fostering **standardized progress** in multimodal interpretability.
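The contrastive vision-language alignment mentioned above builds on the standard CLIP-style symmetric InfoNCE objective; the specific medical-imaging and encoder works may use variants, so the following is the generic form only, in a minimal pure-Python sketch:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def info_nce(img_embs, txt_embs, temp=0.1):
    """Symmetric InfoNCE over matched (image_i, text_i) pairs: each
    image must score its own caption above all others, and vice versa."""
    img = [norm(v) for v in img_embs]
    txt = [norm(v) for v in txt_embs]
    n = len(img)
    loss = 0.0
    for i in range(n):
        for anchors, others in ((img, txt), (txt, img)):
            logits = [dot(anchors[i], others[j]) / temp for j in range(n)]
            m = max(logits)
            # log-sum-exp minus the matched pair's logit = cross-entropy
            lse = m + math.log(sum(math.exp(l - m) for l in logits))
            loss += lse - logits[i]
    return loss / (2 * n)
```

Minimizing this loss pulls each image embedding toward its own report or caption and away from the others, which is what yields the jointly interpretable image-text space described above.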
These innovations **illuminate internal representations**, allowing models to **explain decisions** and **integrate sensory modalities seamlessly**, which is **fundamental for building trustworthy, interpretable AI systems**.
---
## Long-Horizon, Adaptive Agents, and World Models
A defining trend in 2024 is the **development of long-horizon reasoning agents** capable of **adaptive planning**, **multi-step decision-making**, and **dynamic environment interaction**:
- **Empirical-MCTS (N2)**: This approach **combines empirical data** with **Monte Carlo Tree Search**, supporting **long-term planning** and **continual learning** in **complex, evolving environments**.
- **Olaf-World**: Advancing **video-based world modeling**, Olaf-World employs **structured latent action spaces** to **support sequence-level control** and **zero-shot transfer** across diverse scenarios, enhancing **embodied AI applications** like robotics and navigation.
- **Gaia2**: A comprehensive **benchmark** designed to **challenge LLM-powered agents** in **complex, asynchronous environments**, encouraging the development of **resilient, autonomous systems** capable of **multi-task execution**.
- **DreamDojo** (introduced above): Beyond multi-task learning from human videos, it **embodies a generalist robot world model** that **supports environmental understanding**, **multi-tasking**, and **adaptive decision-making** in real-world contexts.
- **PhyCritic**: An innovative **multimodal critic** designed to **evaluate physical interactions**, ensuring **robustness and safety** in real-world deployments.
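How Empirical-MCTS (N2) couples empirical data with tree search is not specified above, but the Monte Carlo Tree Search backbone it builds on can be sketched with plain UCT on a toy "reach exactly N" game. Everything below is a generic illustration, not the N2 algorithm:

```python
import math, random

# Toy game: from state s, add 1 or 2; landing exactly on N scores 1,
# overshooting scores 0.
N = 5
ACTIONS = (1, 2)

def terminal(s):
    return s >= N

def reward(s):
    return 1.0 if s == N else 0.0

class Node:
    def __init__(self, state):
        self.state = state
        self.children = {}   # action -> Node
        self.visits = 0
        self.value = 0.0     # summed rollout rewards

def rollout(s):
    """Random playout to a terminal state."""
    while not terminal(s):
        s += random.choice(ACTIONS)
    return reward(s)

def uct_select(node, c=1.4):
    """UCB1 over children: exploit high means, explore rare actions."""
    def score(a):
        child = node.children[a]
        return (child.value / child.visits +
                c * math.sqrt(math.log(node.visits) / child.visits))
    return max(node.children, key=score)

def simulate(node):
    if terminal(node.state):
        r = reward(node.state)
    elif len(node.children) < len(ACTIONS):
        # Expand one untried action, then evaluate it with a rollout.
        a = [a for a in ACTIONS if a not in node.children][0]
        child = Node(node.state + a)
        node.children[a] = child
        r = rollout(child.state)
        child.visits += 1
        child.value += r
    else:
        r = simulate(node.children[uct_select(node)])
    node.visits += 1
    node.value += r
    return r

def best_action(state, iters=2000, seed=0):
    random.seed(seed)
    root = Node(state)
    for _ in range(iters):
        simulate(root)
    return max(root.children, key=lambda a: root.children[a].visits)
```

From state 3 the search learns to jump straight to 5; an "empirical" variant would, for example, replace the random rollout with a value estimate fit to logged data, which is one plausible reading of the name.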
These systems **support extended reasoning**, **environmental interaction**, and **self-improvement**, **paving the way** for **autonomous, reasoning AI agents** capable of **operating effectively in complex, real-world scenarios**.
---
## Algorithmic Innovations and Search Strategies
Alongside architectural progress, **generative optimization techniques** are **revolutionizing problem-solving and heuristic design**:
- **G-LNS (Generative Large Neighborhood Search)**: Leveraging **large language models**, G-LNS **automatically generates heuristics and optimization strategies**, **accelerating solutions** across diverse domains.
- **RelayInference**: This **scalable inference approach** combines **heterogeneous datasets** with **noisy labels**, **reducing computational costs** while **improving deployment efficiency**.
- **Training-Free Adapters (ASA)**: As noted earlier, these **internal activation manipulators** **enable external tool integration** **without retraining**, supporting **scalable, flexible deployment**.
- **Medical Data Unlearning and Multimodal Autograding**: New frameworks facilitate **HIPAA-aligned unlearning**, allowing models to **forget sensitive data**, alongside **automatic evaluation** of multimodal outputs—**ensuring trust and privacy**.
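G-LNS's key ingredient, LLM-generated heuristics, cannot be reproduced in a few lines, but the large neighborhood search loop those heuristics plug into can: repeatedly destroy part of a solution, repair it with a heuristic, and keep improvements. Below, hand-written destroy/repair stand-ins on a toy TSP (illustrative only; in G-LNS these two functions would be proposed by the model):

```python
import math, random

def tour_length(tour, pts):
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def destroy(tour, rng, k=3):
    """Remove k random cities from the tour."""
    removed = rng.sample(tour, k)
    return [c for c in tour if c not in removed], removed

def repair(partial, removed, pts):
    """Greedy cheapest-insertion of each removed city."""
    tour = list(partial)
    for c in removed:
        best_i, best_cost = 0, float("inf")
        for i in range(len(tour)):
            a, b = tour[i], tour[(i + 1) % len(tour)]
            cost = (math.dist(pts[a], pts[c]) + math.dist(pts[c], pts[b])
                    - math.dist(pts[a], pts[b]))
            if cost < best_cost:
                best_i, best_cost = i + 1, cost
        tour.insert(best_i, c)
    return tour

def lns(pts, iters=300, seed=0):
    rng = random.Random(seed)
    tour = list(range(len(pts)))
    best = tour_length(tour, pts)
    for _ in range(iters):
        partial, removed = destroy(tour, rng)
        cand = repair(partial, removed, pts)
        cand_len = tour_length(cand, pts)
        if cand_len < best:   # accept only improvements
            tour, best = cand, cand_len
    return tour, best
```

The loop is heuristic-agnostic: swapping in better destroy/repair operators, however they are generated, is the entire tuning surface.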
**These innovations** **bridge internal reasoning with search and optimization**, supporting **more autonomous, efficient AI systems** capable of **handling complex, diverse tasks**.
---
## Recent Breakthroughs: Test-Time Training and Video Reasoning Suites
Two notable advancements extend **multimodal reasoning and long-context capabilities** further:
- **Reflective Test-Time Planning for Embodied LLMs**: As @_akhaliq describes, this **innovative approach** employs **test-time training** to **dynamically adapt models during inference**, especially for **long-context understanding** and **autoregressive 3D scene reconstruction**. It **empowers models** to **perform high-fidelity 3D reconstructions** from extended visual inputs, **enhancing spatial reasoning**.
- **A Very Big Video Reasoning Suite**: Introduced above, this **comprehensive evaluation framework** assesses models' ability to **reason over extended videos**. It **reinforces the importance** of **holistic perception** and **test-time adaptation** for **autonomous AI**.
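The test-time-training principle behind these entries can be shown in miniature with its simplest member: re-estimating a model's normalization statistics from the unlabeled test batch before predicting. The embodied-LLM method above is far richer; this toy `NormalizedLinear` model (an invented name) only demonstrates why adapting at inference helps under distribution shift:

```python
def mean(xs):
    return sum(xs) / len(xs)

class NormalizedLinear:
    """Linear model over mean-centered inputs. The centering statistic
    is the only part adapted at test time."""
    def __init__(self):
        self.mu = 0.0   # feature mean used for centering
        self.w = 0.0
        self.b = 0.0

    def fit(self, xs, ys, lr=0.05, steps=500):
        self.mu = mean(xs)
        for _ in range(steps):
            for x, y in zip(xs, ys):
                z = x - self.mu
                err = self.w * z + self.b - y
                self.w -= lr * err * z
                self.b -= lr * err

    def adapt(self, test_xs):
        """Test-time step: refresh mu from unlabeled test inputs,
        touching no supervised parameters."""
        self.mu = mean(test_xs)

    def predict(self, x):
        return self.w * (x - self.mu) + self.b
```

If the test inputs arrive shifted, predictions made with the stale training statistics are badly wrong, and a single unsupervised adaptation step restores accuracy, which is the gap test-time training closes at much larger scale.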
These advances **enable models** to **operate effectively across long temporal contexts**, **perform detailed spatial reconstructions**, and **adapt dynamically during inference**, **closing the gap** between **perception and reasoning**.
---
## Latest Developments in Cross-Embodiment and Dexterous Manipulation
Building upon these foundational trends, recent articles introduce **noteworthy innovations**:
- **LAP (Language-Action Pre-Training)**: As @_akhaliq elaborates, **LAP** facilitates **zero-shot cross-embodiment transfer**, allowing models trained in one domain or embodiment to **generalize to others**. This method **bridges the gap** between **language understanding** and **embodied action**, key for **versatile autonomous agents**. [Read more](https://t.co/YTxNABdwr)
- **EgoScale**: Focused on **scaling dexterous manipulation**, EgoScale employs **diverse egocentric human data** to **train models capable of fine motor control** in robotic systems. This approach **leverages large-scale naturalistic data** to **improve generalization and dexterity**. [Details here](https://t.co/pak...)
- **Reflective Test-Time Planning for Embodied LLMs**: Discussed above for long-context 3D reconstruction, this framework also **integrates reflection and self-assessment** during inference, **enhancing robustness and safety** in **embodied AI**, especially in complex manipulation or navigation tasks. It exemplifies **the trend toward adaptive, self-improving physical agents**.
---
## Current Status and Broader Implications
The innovations of 2024 **embody a paradigm shift** toward **AI systems that are more transparent, interpretable, adaptive, and capable of reasoning over extended, multimodal contexts**:
- **Data Foundations**: Tools like **DataChef**, **AI Replication Engine**, and **standardized protocols** **ensure data quality, bias mitigation, and reproducibility**, establishing **trustworthy foundations**.
- **Internal Explainability**: Techniques such as **attention decoding**, **LatentLens visualization**, **Activation Steering Adapters (ASA)**, and **autograding for multimodal outputs** **demystify internal processes**, **building user confidence**.
- **Multimodal and Long-Horizon Capabilities**: Architectures like **LatentLens**, **OneVision-Encoder**, **Olaf-World**, and **video reasoning suites** **support seamless sensory integration** and **extended reasoning**, essential for **embodied, autonomous systems**.
- **Autonomous, Adaptive Agents**: Frameworks such as **Empirical-MCTS**, **Gaia2**, and **DreamDojo** **enable long-term planning**, **multi-task learning**, and **self-improvement**, **bridging perception and action** in real-world scenarios.
- **Algorithmic and Search Innovations**: Approaches like **G-LNS**, **RelayInference**, and **test-time training** **automate heuristic design**, **reduce inference costs**, and **support dynamic adaptation**.
**In sum**, 2024's advances **favor AI systems that are not only powerful but also aligned, transparent, and trustworthy**—capable of reasoning across complex, multimodal, and long-horizon scenarios.
---
## Broader Implications and Future Outlook
The trajectory of 2024 **signals a future where AI systems are inherently interpretable, reliable, and societally aligned**. Emphasizing **data provenance**, **internal transparency**, and **multimodal integration** directly addresses **trustworthiness and ethical concerns**, especially in high-stakes domains like **healthcare, scientific discovery, and autonomous systems**.
Emerging concepts such as **AI-augmented authenticity verification** and **concept erasure benchmarks** (e.g., a comprehensive **WACV 2026** evaluation of concept erasure in diffusion models) **pave the way** for **digital trust mechanisms** that **counter misinformation and content manipulation**.
Furthermore, the development of **long-horizon, adaptive reasoning agents**, **cross-embodiment transfer techniques** like LAP, and **refined test-time planning** **support resilient, autonomous systems** capable of **safe, effective operation in dynamic environments**—ranging from **robotic manipulation** to **strategic decision-making**.
**Overall**, 2024's innovations **embody a movement toward AI that is transparent, interpretable, adaptable, and reasoning-enabled across modalities and contexts**. These advances **lay the groundwork** for **autonomous systems seamlessly integrated into society**, **supporting scientific progress**, and **ensuring safety and reliability** across numerous domains.
---
Looking ahead, continued attention to **explainability, privacy-preserving unlearning, regulatory compliance**, and **dynamic inference** will be crucial. As research accelerates, the overarching goal remains clear: **AI systems that not only outperform humans but also justify their decisions, respect user privacy, and operate transparently in complex, real-world scenarios**. The innovations of 2024 **set a promising course toward this vision**, heralding an era of **trustworthy, reasoning-enabled multimodal AI systems**.