Advances in Multi-Agent Systems and Agent Reliability
Key Questions
What research improves multi-agent system reliability?
StreamMA reduces latency and improves accuracy via streaming communication, while TELBench and DRIFT enable process-level error localization in long trajectories. MemTrain provides self-supervised memory training, with a gain of more than 17 points on long-text QA.
How does MMG2Skill advance agent capabilities?
MMG2Skill converts web guides into self-evolving skills, improving how agents acquire skills. Separately, a new scaling law, Effective Feedback Compute, redefines agent harness efficiency.
What benchmarks test multimodal agent memory?
WorldMemArena evaluates the memory performance of multimodal agents. Related work such as MapAgent explores industrial-scale agentic frameworks for city-level map generation.
A wave of research tackles key agent bottlenecks. StreamMA reduces latency and improves accuracy via streaming communication and a step-level scaling law. TELBench and DRIFT enable process-level error localization in long trajectories. MemTrain offers self-supervised memory training, with a gain of more than 17 points on long-text QA. MMG2Skill converts web guides into self-evolving skills, and a new scaling law, Effective Feedback Compute, redefines agent harness efficiency. Together, these results improve agent reliability, memory, and skill acquisition.