Agentic AI & Simulation

**Self-evol RL: Self-Execution/Cog-DRIFT/CORAL/Fabricate/StepSearch/A-Evolve/Glean/Hyperagents/Omni-SimpleMem/SKILL0/AutoAgent/Agent0/EvoAgentX/AutoKernel/Neuro-sym/Traj Sampling/On-Policy/Sieve-Gen/Hermes/GrandCode/Vero/autoresearch/ThinkTwice/Learning Retrieve/Paper Circle** [developing]

Key Questions

What is Cog-DRIFT and how does it work?

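Cog-DRIFT enables reinforcement learning with verifiable rewards (RLVR) on hard problems that yield zero reward, using a zone-of-proximal-development (ZPD) framing, adaptive simplification, and MCQ/cloze reformulations. It breaks the exploration barrier that arises when every rollout fails (pass@64 = 0). Multiple posts highlight it as an innovation from Princeton researchers.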
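To make the zero-reward barrier concrete, here is a minimal Python sketch. It is not Cog-DRIFT's implementation. It assumes the standard unbiased pass@k estimator and a group-normalized advantage of the kind GRPO uses; the `FORMATS` ladder and the `pick_format` helper are hypothetical names introduced only to illustrate adaptive simplification.

```python
from math import comb
from statistics import mean, pstdev

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n sampled rollouts, c of them correct."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct rollout
    return 1.0 - comb(n - c, k) / comb(n, k)

def group_advantages(rewards, eps=1e-6):
    """Group-normalized advantages: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical difficulty ladder, hardest format first (an assumption).
FORMATS = ["open_ended", "cloze", "mcq"]

def pick_format(success_rates: dict, floor: float = 0.0) -> str:
    """Return the hardest format whose observed success rate exceeds `floor`.

    Falls back to the easiest format when every format is at the floor.
    """
    for fmt in FORMATS:
        if success_rates.get(fmt, 0.0) > floor:
            return fmt
    return FORMATS[-1]

# With 64 failed rollouts, pass@64 is 0 and every advantage is 0,
# so a group-normalized policy gradient carries no learning signal.
print(pass_at_k(64, 0, 64))
print(group_advantages([0.0] * 8))
print(pick_format({"open_ended": 0.0, "cloze": 0.0, "mcq": 0.4}))
```

Under these assumptions, moving down the ladder until some rollouts succeed restores reward variance within the group, which is what gives the update a nonzero signal.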

What improvements does Self-Execution provide?

Self-Execution applies simulation-based post-training so that LLMs can verify and fix the code they generate. It targets reasoning LLMs used for code generation, and the paper reports improved performance on coding tasks.

What is AutoAgent, and how does it rank on benchmarks?

AutoAgent ranks #1 on the Spreadsheet and Terminal benchmarks. It leverages self-improvement techniques like those in Agent0 and EvoAgentX, and is part of OSS efforts in fast simulation and RL.

How does ThinkTwice optimize LLMs?

ThinkTwice jointly optimizes an LLM for reasoning and for self-refinement, so that both improve together. The paper discusses its approach to test-time learning.

What is Paper Circle?

Paper Circle is an OSS multi-agent system for research discovery. It ties into autoresearch by Karpathy. It facilitates trajectory-based learning and retrieval.

What has CORAL achieved?

CORAL uses multi-evolution and outperforms FunSearch by 3-10x. It focuses on self-evolving RL techniques and is included in the highlight for simulation advancements.

What is GrandCode?

GrandCode achieves grandmaster level in competitive programming via agentic RL. It represents high-level autonomous coding. The paper page details its methodology.

What is AutoKernel?

AutoKernel, released by RightNow AI, is an open-source framework that applies autonomous agent loops to GPU kernel optimization for PyTorch models. It aids in efficient model training.

Cog-DRIFT enables RLVR on hard zero-reward problems (pass@64 = 0) via ZPD, adaptive simplification, and MCQ/cloze. Self-Execution applies simulation post-training to verify and fix code. CORAL's multi-evolution outperforms FunSearch by 3-10x. AutoAgent ranks #1 on Spreadsheet and Terminal. ThinkTwice jointly optimizes reasoning and self-refinement. Learning to Retrieve learns from trajectories. Paper Circle is an OSS multi-agent research system.

Also covered: Glean/Fabricate/StepSearch StePPO; Hyperagents metacognition; Omni-SimpleMem +411%; SKILL0 ICRL +9.7%; OSS self-improvement with Agent0/EvoAgentX/AutoKernel/Hermes; neuro-symbolic long-horizon; IRAF/UI-Voyager/A-Evolve/CORECRAFT/STAR/KARL/Sakana/MemFactory; Karpathy AutoResearch; TRL v1.0 GRPO/FIPO; Vero visual RL; noisy supervision. Urgent code/fidelity.

Sources (35)
Updated Apr 8, 2026