Video-MME-v2 Advances Video Understanding Benchmarks
Video-MME-v2 marks the next stage in benchmarks for comprehensive video understanding.

Created by Aleah Desiree
Latest AI/ML research papers from arXiv and major conferences
New paper In-Place Test-Time Training explores efficient test-time adaptation. Join the discussion on this paper page.
New arXiv paper Demystifying When Pruning Works via Representation Hierarchies unpacks when and why pruning is effective, a question vital for model compression. Join the discussion.
MegaTrain enables full-precision training of 100B+ parameter LLMs on a single GPU, democratizing massive model development through extreme hardware efficiency.
Major update to flow map language models: researchers hail them as the future of non-autoregressive text generation. The work introduces a new class of continuous flow-based models, detailed in a fresh paper and blog post.
🚨 Alarming preprint: in randomized controlled trials, just 10 minutes of AI assistance made participants perform worse and give up more often than those working without AI. A stark warning for education and productivity tools.
In a key advance for open speech AI, the Open Whisper-style Speech Model (OWSM) reproduces OpenAI's Whisper using only publicly available data and open-source toolkits, as a first step toward fully open speech models.
Rapid advances in AI agent foundations:
New post-training technique for coding LLMs: simulate test execution to verify and fix their own code.
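The verify-and-fix idea can be illustrated with a minimal sketch. All names here are hypothetical (the paper's actual method is not specified in this blurb): candidate programs are checked against unit tests, and the first candidate that passes is kept.

```python
# Hypothetical sketch of a verify-and-fix selection loop for code generated by
# an LLM. A real system would let the model *predict* test outcomes and revise
# its own code; here we simply execute the tests directly.

def run_tests(code_str, tests):
    """Execute candidate code defining solution(x); return the list of failures."""
    namespace = {}
    exec(code_str, namespace)
    failures = []
    for inp, expected in tests:
        got = namespace["solution"](inp)
        if got != expected:
            failures.append((inp, expected, got))
    return failures

def verify_and_fix(candidates, tests):
    """Return the first candidate (e.g. successive model revisions) passing all tests."""
    for code in candidates:
        if not run_tests(code, tests):
            return code
    return None

# Toy usage: two candidate implementations of abs(); the first is buggy.
buggy = "def solution(x):\n    return x  # forgets the negative case"
fixed = "def solution(x):\n    return -x if x < 0 else x"
tests = [(3, 3), (-4, 4)]
best = verify_and_fix([buggy, fixed], tests)  # selects the fixed revision
```

In a post-training setup, the pass/fail signal from such a loop could serve as a reward or filter for further fine-tuning.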
AURA enables always-on understanding and real-time assistance via video streams, marking a breakthrough in continuous multimodal perception for proactive AI support.
LIBERO-Para introduces a diagnostic benchmark and metrics to expose paraphrase robustness vulnerabilities in VLA models. Key for advancing reliable vision-language-action systems.
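A paraphrase-robustness metric of the kind the blurb describes can be sketched simply. This is a hypothetical illustration, not LIBERO-Para's actual metric: it compares a policy's task success rate on canonical instructions against paraphrased variants of the same tasks.

```python
# Hypothetical paraphrase-robustness score for a vision-language-action policy.
# 1.0 means success is unchanged under paraphrase; lower values expose
# sensitivity to instruction wording.

def success_rate(outcomes):
    """Fraction of successful episodes (outcomes are 1 = success, 0 = failure)."""
    return sum(outcomes) / len(outcomes)

def paraphrase_robustness(canonical, paraphrased):
    """Ratio of paraphrase success rate to canonical success rate."""
    base = success_rate(canonical)
    return success_rate(paraphrased) / base if base > 0 else 0.0

# Toy example: 9/10 success on canonical phrasing, 6/10 on paraphrases.
canonical = [1] * 9 + [0]
paraphrased = [1] * 6 + [0] * 4
score = paraphrase_robustness(canonical, paraphrased)
```

Reporting the ratio rather than the raw paraphrase success rate isolates wording sensitivity from overall task difficulty.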
LightThinker++ marks a shift from reasoning compression to memory management in LLM efficiency. A key new paper for next-generation reasoning optimization. Join the discussion.
Today's fresh arXiv drops spotlight hybrid LLMs and distillation:
Critical cloud vulnerability: remote attackers can reconstruct DNN architectures on NVIDIA GPUs from execution traces, without direct access to the model.
Trend alert: AI skeptics underscore LLMs' reasoning limits, advocating symbolic alternatives.
Key failures in AI unlearning expose ongoing privacy vulnerabilities:
New arXiv paper introduces learnable adaptation policies enabling language agents to learn at test-time. Join the discussion.
Amid rapid AI advancements, a new comparative deep learning study finds computer vision-based disease detection to be a reliable and scalable approach.
Novel DBT-DR-GAN advances medical imaging with a three-stage pipeline for high-fidelity CT-to-MRI translation: