27B Model Tops 397B Giant and MiniMax-M2.5 on SWE-Bench
27B model outperforms a 397B model and MiniMax-M2.5 on the SWE-Bench coding benchmark, sparking debate: real efficiency breakthrough or benchmaxxed?

Created by Mayssa Haddar
Cutting‑edge ML theory, algorithms, and model architecture updates from top conferences and labs
Different language models independently learn similar internal representations for numbers, revealing convergent evolution in their architectures.
ReImagine rethinks controllable high-quality human video generation via image-first synthesis.
SAVOIR proposes Shapley-based reward attribution to learn social savoir-faire in agents.
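SAVOIR's exact formulation isn't reproduced here, but Shapley-based reward attribution rests on a standard idea: each agent's credit is its average marginal contribution to the team reward across all join orders. A minimal sketch on a hypothetical three-agent team game (the `team_reward` table is an illustrative assumption, not from the paper):

```python
from itertools import permutations

def shapley_values(agents, reward):
    """Exact Shapley values: average marginal contribution over all orderings."""
    values = {a: 0.0 for a in agents}
    orders = list(permutations(agents))
    for order in orders:
        coalition = set()
        for a in order:
            before = reward(frozenset(coalition))
            coalition.add(a)
            # Marginal contribution of `a` given who joined before it.
            values[a] += reward(frozenset(coalition)) - before
    return {a: v / len(orders) for a, v in values.items()}

def team_reward(coalition):
    # Toy characteristic function (hypothetical): A and B synergize.
    table = {frozenset(): 0, frozenset({"A"}): 1, frozenset({"B"}): 2,
             frozenset({"C"}): 0, frozenset({"A", "B"}): 4,
             frozenset({"A", "C"}): 1, frozenset({"B", "C"}): 3,
             frozenset({"A", "B", "C"}): 6}
    return table[coalition]

print(shapley_values(["A", "B", "C"], team_reward))
# A ≈ 1.83, B ≈ 3.33, C ≈ 0.83 — the credits sum to the full team reward of 6.
```

Exact Shapley is exponential in the number of agents; practical systems typically approximate it by sampling permutations.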
Trend alert: Google and Sakana AI Labs push specialized agents automating scientific workflows—from reports to evaluations.
Sakana AI solves the deceptively deep challenge of LLMs performing fair internal coin tosses using prompts alone. Their paper "SSoT: Prompting LLMs for Distribution-Faithful and Diverse Generation" was accepted to #ICLR2026.
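The SSoT prompting method itself isn't shown here; as a hedged illustration of why "fair internal coin tosses" is hard to verify, here is one standard way to audit an LLM's sampled 0/1 outputs for bias, using a binomial z-statistic (the simulated tosses stand in for model outputs):

```python
import math
import random

def fairness_z(tosses):
    """Two-sided z-statistic for H0: P(heads) = 0.5 on a list of 0/1 samples."""
    n = len(tosses)
    heads = sum(tosses)
    # Under H0, heads ~ Binomial(n, 0.5): mean n/2, variance n/4.
    return (heads - n * 0.5) / math.sqrt(n * 0.25)

random.seed(0)
simulated = [random.randint(0, 1) for _ in range(1000)]  # stand-in for LLM tosses
z = fairness_z(simulated)
print(f"z = {z:.2f}")  # |z| < 1.96 means no evidence of bias at the 5% level
```

In practice LLMs sampled naively are often far from fair, which is what makes a prompting-only fix nontrivial.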
One Hugging Face Hub data point so far; tagging more agents soon.
Emerging trend in self-evolving LLM techniques:
A 1.7B parameter model beats GLM-5 (744B) on Schema-Guided Dialogue—even with corrupted training data—a 437x size gap that showcases the strength of data-efficient small models.
New paper unpacks reward hacking mechanisms, emergent misalignment, and challenges in the era of large models. Join the discussion on this key LLM alignment topic.
LLaDA2.0-Uni introduces a Diffusion Large Language Model that unifies multimodal understanding and generation. Breakthrough for seamless multimodal LLMs from core AI research.
GPT-5.5 and internal model names spotted on Codex—a classic sign OpenAI is gearing up for a new release. Eyes on the horizon for their next major leap.
CoInteract introduces physically-consistent human-object interaction video synthesis via spatially-structured co-generation, pushing realistic video models in human-object dynamics.
PlayCoder bridges LLM code generation to interactive, playable GUIs, turning generated code into functional interfaces. Join the paper discussion for deeper insights.
OpenAI ditches Sora for enterprise-focused image gen, prioritizing text-heavy designs like infographics, magazines, and posters.
New Nature Machine Intelligence paper uncovers two competing biases explaining LLMs' over- and under-confidence: