AI Frontier Digest

5h ago

Domain Agents Scale on Real-World Data

Robotics leap: Xiaomi-Robotics-1, a VLA model, pre-trains on 100K+ hours of real trajectories with auto-labeling, then aligns via post-training for...

Xiaomi-Robotics-1: Scaling Vision-Language-Action Models with over 100K Hours of Real-World Trajectories

arxiv.org

5h ago

RecGPT-V3 Deploys Stateful Multi-Agent Reasoning at Taobao Scale

RecGPT-V3 brings production-grade multi-agent recommender systems to Taobao's "Guess What You Like" feed by solving three core scaling bottlenecks.

-...

arxiv.org

RecGPT-V3 Technical Report

5h ago

Multimodal AI Broadens to Science, Video Gen, and Long AV

Three fresh models highlight multimodal expansion across scientific reasoning, video generation, and long-form audio-visual understanding, with...

S1-Omni: A Unified Multimodal Reasoning Model for Scientific Understanding, Prediction, and Generation

arxiv.org

5h ago

World Models + Skill Wikis Cut Agent Trial-and-Error Costs

Two papers show agents shifting from expensive trial-and-error to anticipatory, skill-based systems:

DSWorld predicts data-science operation...

DSWorld: A Data Science World Model for Efficient Autonomous Agents

arxiv.org

5h ago

RAGU Delivers Modular Multi-Step GraphRAG with Compact Domain LLM

RAGU tackles noisy single-pass GraphRAG via explicit separation of extraction from consolidation.

Two-stage typed extraction, DBSCAN deduplication,...

RAGU: A Multi-Step GraphRAG Engine with a Compact Domain-Adapted LLM

arxiv.org

5h ago

Frontier Labs Split on Model Paradigms

Two notable releases highlight diverging paths in frontier model design:

Thinking Machines' Inkling (952B MoE) ships as open-weight native...

5h ago

Pretraining Shapes RL Reasoning Returns

Pretraining loss strongly predicts post-RL performance on reasoning tasks, while RL reward curve slopes improve linearly with pretraining tokens. This...

Understanding Reasoning from Pretraining to Post-Training

arxiv.org

5h ago

16h ago

AI Frontier Digest · Jul 20, 2026 Daily Digest

Frontier Model Releases

🔥 Kimi K3 Open-Weight Release: Moonshot AI released Kimi K3, a 2.8-trillion-parameter MoE transformer model with 50B...

China's Kimi K3 Triggers an AI Sputnik Moment, 2.8 Trillion ...

finance.biggo.com

1d ago

Kimi K3 Exposes the Geopolitics of Open Weights

Kimi K3's 2.8T open-weight release marks an "AI Sputnik moment," delivering frontier performance at 1% of US costs despite older chips and export...

The Politics of Weights. Why AI is normative infrastructure, and… | by Adnan Masood, PhD. | Jul, 2026 | Medium

medium.com

1d ago

RISE: Representation Ensemble for Deeper LLM Collaboration

RISE lets LLMs share internal representations instead of just final outputs, using relational alignment and orthogonal transforms to handle layer and...

Deep Collaboration between Large Language Models via ...

researchgate.net

1d ago

AI Frontier Digest · Jul 19 Daily Digest

Inference Efficiency Breakthroughs

🔥 Byte-Exact KV-Cache Grafting: The technique enables a frozen Gemma-4-12B model to reach...

Smarter and Cheaper at Once: Byte-Exact KV-Cache Grafting Turns a Frozen Small Model into a Verified-Knowledge Flywheel

arxiv.org

2d ago

GPT-5.6 Closes 30-Year Convex Optimization Gap

GPT-5.6 closed a 30-year-old gap in convex optimization with a single specialized prompt. The 10-page prompt drew on a year of prior human research, underscoring frontier models' emerging role in tackling longstanding mathematical challenges.

GPT-5.6 used a prompt to close a 30-year gap in convex optimization

news.ycombinator.com

2d ago

KV-Cache Grafting Supercharges Frozen Small Models

Byte-exact KV-cache grafting turns a frozen 12B model into a verified-knowledge engine: on AIME 2025 it jumps from 80.0% to 93.3% (surpassing its 31B...

arxiv.org

Smarter and Cheaper at Once: Byte-Exact KV-Cache Grafting Turns a Frozen Small Model into a Verified-Knowledge Flywheel

2d ago

Flawed Protocols Undermine Harness Evolution Claims

Automatic harness evolution for LLM agents shows no consistent gains over simple test-time scaling and limited generalization to held-out tasks. The...

Rethinking the Evaluation of Harness Evolution for Agents

arxiv.org

2d ago

GRASP: Adaptive Granularity for Agentic RAG

GRASP trains agents with RL to dynamically coordinate semantic search, keyword search, and paragraph reading, retrieving sentence-level evidence only...