TARGET-SFT vs On-Policy Distillation: Contrasting LLM Post-Training Paths
Two post-training methods target LLM alignment and efficiency through distinct routes.
- TARGET-SFT is positioned as a fine-tuning breakthrough for AI...

Created by Hydrangea10
Frontier AI research news on LLM architectures, training methods, and theory
Explore the latest content tracked by Frontier AI Insights
Modern LLMs like Qwen3-Next, Kimi Linear, and Ling 2.5 are shifting to hybrid layers that replace most full attention with cheaper linear or recurrent...
World models that pretrain for imagination then fine-tune for action represent a rising direction in embodied AI and reinforcement learning....
Formal mathematics is emerging as a key fix for reward hacking in LLM reasoning.
Daniel Barzilai connects longstanding learning theory to today's frontier concerns in two parts.
Google's DiffusionGemma introduces a non-autoregressive approach to text generation that contrasts sharply with standard left-to-right token...
A discussion on finding optimal tokenizers for LLMs has drawn 27 points on Hacker News, underscoring interest in tokenizer design's role in model performance.
Grammar-constrained decoding enables LLMs to generate malicious code, exposing a critical safety vulnerability in existing alignment techniques. The...
Systematic comparisons of large language models reveal that capabilities evolve quickly with each new release, rendering conclusions potentially outdated soon after publication.
Switchable latent recurrence uses a visible-to-latent curriculum and Switch-GRPO objective to propagate gradients through recurrent latent computation, enabling efficient hidden-state reasoning in frontier models.
HarnessBridge presents a learnable bidirectional controller that strengthens LLM agent performance by improving how agents interact dynamically with complex environments.
A clear trend is emerging: LLM agents are moving beyond static setups toward autonomous co-evolution of policies, training harnesses, and...
Recent papers spotlight complementary weaknesses in current designs:
CPPO refines token-level trust regions by dropping uniform divergence thresholds that ignore how early errors cascade.
It applies position-weighted...
New research shows that one-shot GRPO training can compromise LLM safety alignment, underscoring vulnerabilities in current frontier models. The arXiv digest also flags 10 emerging trends spanning alignment, RL, and multimodal methods.
A new study suggests that continuous diffusion models for categorical data could outperform large language models on text-to-speech tasks.
Yann LeCun's new paper urges the AI industry to abandon its fixation on AGI, opening with the provocative claim that even Magnus Carlsen isn't good at chess.