**Theory of diffusion models and sampler design** [developing]
Key Questions
What one-step and trajectory-guided diffusion advances are noted?
Models such as FlashMotion, V-Co, FlowScene, UniGRPO, and VGGRPO, along with Diffutron (masked diffusion) and GenMask DiT for segmentation, enable efficient generation. MMaDA-VLA integrates diffusion into vision-language-action modeling.
What is the Diffusion Transformer for animal motion?
It applies a DiT to dense point trajectories drawn from a 300-hour dataset of wild animal motion, advancing the theory of motion synthesis.
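A minimal sketch of what "DiT on dense point trajectories" can look like, under assumed conventions: each frame's tracked points are flattened into one token, and a toy attention layer with random weights stands in for the trained transformer. The shapes, weights, and update rule here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head self-attention over per-frame trajectory tokens."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(x.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)
    return w @ v

def denoise_trajectories(traj, steps, rng):
    """Toy DiT-style denoising over dense point trajectories.

    traj: (T, P, 3) array — P tracked points over T frames (assumed layout).
    Random weights stand in for a trained model; each step subtracts a small
    fraction of the predicted noise.
    """
    T, P, D = traj.shape
    x = traj.reshape(T, P * D)                    # one token per frame
    d = P * D
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    for _ in range(steps):
        eps_hat = self_attention(x, Wq, Wk, Wv)   # toy noise prediction
        x = x - eps_hat / steps                   # small denoising update
    return x.reshape(T, P, D)
```

The per-frame tokenization is one of several plausible choices; a real system would likely tokenize per point or per spatio-temporal patch.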
What efficiency gains does MDM-Prime-v2 offer?
MDM-Prime-v2 reports a 21.8x efficiency improvement, contributing to sampler-design optimization.
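For context on where masked-diffusion sampler efficiency comes from, here is a generic confidence-based unmasking loop — a sketch of the family of samplers, not MDM-Prime-v2's actual algorithm; the random logits stand in for a learned denoiser, and the per-step unmasking budget is an assumption.

```python
import numpy as np

def masked_diffusion_sample(vocab_size, seq_len, steps, rng):
    """Generic masked-diffusion sampling: start fully masked, then at each
    step commit the most confident tokens. Fewer steps means fewer model
    calls, which is the basic lever behind sampler-efficiency gains."""
    MASK = -1
    seq = np.full(seq_len, MASK)
    for _ in range(steps):
        masked = np.flatnonzero(seq == MASK)
        if masked.size == 0:
            break
        # toy "denoiser": random logits stand in for a trained model
        logits = rng.standard_normal((masked.size, vocab_size))
        conf = logits.max(axis=1)
        # unmask a fixed budget of positions per step
        k = max(1, int(np.ceil(seq_len / steps)))
        order = np.argsort(-conf)[:k]
        seq[masked[order]] = logits[order].argmax(axis=1)
    return seq
```

With `seq_len=8` and `steps=4`, the model is called 4 times instead of 8; aggressive schedules trade steps against sample quality.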
How does Salt improve video generation?
Salt uses self-consistent cache-aware training for fast video generation. It aligns with distribution matching in diffusion theory.
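The general idea behind cache-aware fast sampling can be sketched as follows — this is a generic illustration, not Salt's actual method: the heavy network block is recomputed only periodically, and its cached output is reused in between, under the assumption that features drift slowly across adjacent denoising steps.

```python
import numpy as np

def heavy_block(x):
    """Stand-in for an expensive transformer block."""
    return np.tanh(x) * 0.5

def cache_aware_sample(x, steps=16, refresh_every=4):
    """Sketch of cache-aware sampling: recompute the heavy block only every
    `refresh_every` steps and reuse the cached output in between.
    Returns the sample and the number of heavy-block calls actually made."""
    cache, calls = None, 0
    for s in range(steps):
        if cache is None or s % refresh_every == 0:
            cache = heavy_block(x)                # refresh the cache
            calls += 1
        x = x - cache / steps                     # reuse cached features
    return x, calls
```

With `steps=16` and `refresh_every=4`, the heavy block runs 4 times instead of 16 — the kind of wall-clock saving cache-aware schemes target; "self-consistent" training would additionally teach the model to tolerate slightly stale features.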
What art and editing applications use diffusion?
AiS captures structural abstraction in art via stylization proxies, while FlowSlider enables training-free continuous image editing. ONE-SHOT uses spatial-decoupled motion synthesis.
Summary: one-step and trajectory-guided advances (FlashMotion, V-Co, FlowScene, UniGRPO, VGGRPO; Diffutron masked diffusion; GenMask DiT for segmentation; MMaDA-VLA vision-language-action diffusion; FlowSlider editing; AiS art stylization proxies; Salt self-consistent cache-aware fast video; ONE-SHOT spatial-decoupled motion); a new Diffusion Transformer for animal motion (DiT on dense point trajectories, 300h wild dataset); theoretical threads in Discrete Diffusion MMD and Schrödinger Bridges; MDM-Prime-v2 (21.8x efficiency), UNITE token diffusion, and MinerU-Diffusion OCR. These advances guide RL curricula; ImagenWorld/GEditBench provide human evaluations.
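As a reminder for the Schrödinger-bridge thread mentioned above, the standard entropic formulation seeks the path measure closest to a reference diffusion subject to prescribed endpoint marginals:

```latex
\min_{\mathbb{P}\,:\,\mathbb{P}_0=\mu,\;\mathbb{P}_1=\nu}
\operatorname{KL}\!\left(\mathbb{P}\,\middle\|\,\mathbb{Q}\right)
```

where $\mathbb{Q}$ is the reference path measure (e.g., Brownian motion) and $\mu$, $\nu$ are the start and end marginals; diffusion samplers arise as the special case where one marginal is the data distribution and the other is noise.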