ETH Zurich Continual Learning Breakthrough
Key Questions
What breakthrough does the ETH Zurich work achieve?
Reverse-order learning combined with self-distillation (SDFT/SDPO) is reported to achieve 100% knowledge retention, eliminating catastrophic forgetting. The fix is a simple algorithmic change that enables lifelong learning without resetting the model, and it is reported to be faster than GRPO.
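The note above does not spell out the algorithm, so the following is only a toy sketch of the claimed mechanism, not the actual SDFT/SDPO implementation: a one-parameter "model" is trained on a sequence of tasks, and a self-distillation term pulls updates toward a frozen pre-task snapshot of the model (the model acting as its own teacher), which limits how much earlier tasks are overwritten. All function names and constants here are invented for illustration.

```python
# Hypothetical toy sketch of continual learning with self-distillation.
# "Model" = one float; each task pulls it toward a task-specific target.

def train_task(w, target, steps=100, lr=0.1, distill=0.0, teacher=None):
    """One task: minimize (w - target)^2, optionally plus a
    self-distillation penalty distill * (w - teacher)^2 toward a
    frozen snapshot of the model taken before this task."""
    for _ in range(steps):
        grad = 2 * (w - target)                  # task-loss gradient
        if teacher is not None:
            grad += 2 * distill * (w - teacher)  # self-distillation gradient
        w -= lr * grad
    return w

def continual(tasks, distill=0.0, reverse=False):
    """Train on tasks sequentially; optionally in reverse order
    (the note's other claimed ingredient)."""
    order = list(reversed(tasks)) if reverse else list(tasks)
    w = 0.0
    for target in order:
        teacher = w if distill > 0 else None     # freeze snapshot = self-teacher
        w = train_task(w, target, distill=distill, teacher=teacher)
    return w

tasks = [1.0, -1.0]
plain = continual(tasks)                 # plain SGD: ends at the last task only
sd = continual(tasks, distill=1.0)       # compromises, keeping earlier tasks
sd_rev = continual(tasks, distill=1.0, reverse=True)
```

In this toy, plain sequential training converges to the last task's optimum (total forgetting of task 1), while the self-distilled run stays measurably closer to the first task's optimum, which is the retention effect the note attributes to the method.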
How does this align with broader trends?
It aligns with LeCun's emphasis on world modeling and with broader continual-learning trends, and it matters for agentic AI and post-training efficiency. The timing also coincides with the ICML 2026 CATS workshop CFP on continual adaptation.
Why is self-distillation important here?
Self-distillation is key to the method's knowledge retention across tasks: the model's own earlier outputs serve as the teacher, so new-task updates are regularized against overwriting prior behavior. Self-distillation is becoming increasingly prominent in LLM post-training, as discussed in related videos with researchers, and it enables continual learning without performance drops.
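To make the self-distillation idea concrete, here is a minimal, self-contained sketch of the standard distillation penalty, KL(teacher || student), computed over softmax distributions. The logits below are made-up numbers; the note does not specify the actual loss used in SDFT/SDPO, so this only illustrates the general objective.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

def kl(p, q):
    """KL(p || q): divergence of the student q from the teacher p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits: the frozen teacher is the model's own earlier snapshot.
teacher_logits = [2.0, 0.5, -1.0]
student_logits = [1.0, 1.0, 0.0]

p = softmax(teacher_logits)
q = softmax(student_logits)
penalty = kl(p, q)  # added to the new-task loss, weighted by some beta
```

Minimizing this penalty alongside the new-task loss keeps the student's predictive distribution close to its own pre-update behavior; moving the student logits partway toward the teacher's strictly reduces the penalty in this example.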