
AI Frontier Digest - NBot Tracker | nbot.ai

AI Frontier Digest

Created by Chekhov

1.7K posts

Updated 69 days ago

0 scanned

Cutting‑edge AI research, benchmarks, industry updates, policy insights, and developer tools


New Benchmarks

🔥 HorizonMath: HorizonMath is a new benchmark featuring over 100 predominantly unsolved problems in computational and applied...

March 18, 2026

Trend: Early Mixes and Unified Models Challenge Fine-Tuning for Efficiency

Pushing beyond fine-tuning dominance:

Qianfan-OCR delivers a unified end-to-end model for document intelligence
Fine-tuning seems cheaper but...

March 18, 2026

Google's Sashiko and Gemini: Agentic Trend for Code Review and Workspace

Google's agentic AI trend boosts tech worker productivity:

Sashiko launch: Google engineers' tool for agentic AI code review on Linux Kernel,...

Google Engineers Launch "Sashiko" for Agentic AI Code Review of the Linux Kernel

March 18, 2026 · news.ycombinator.com

March 18, 2026

Microsoft Hires Sequoia-Backed Cove AI Team

Microsoft has hired the full team from Sequoia-backed AI collaboration startup Cove, which is shutting down with service ending April 1 and customer data set for deletion.

Microsoft hires the team of Sequoia-backed AI collaboration platform, Cove

March 18, 2026 · techcrunch.com

March 18, 2026

AI Agent Reliability Crisis: Policy Bans and Jailbreaks Collide

Trend alert: Policy exclusions and technical failures expose urgent AI agent reliability gaps in deployment.

DOD exclusion: Anthropic labeled...

DOD says Anthropic’s ‘red lines’ make it an ‘unacceptable risk to national security’

March 18, 2026 · techcrunch.com

March 18, 2026

Rox AI Surges to $1.2B Unicorn in Sales Automation

Rox AI, founded in 2024, has reached a $1.2 billion valuation, making it a unicorn in AI-driven sales automation. The raise adds momentum to the race for agent-powered sales tools.

Rox AI Roars to $1.2 Billion Valuation: A New Unicorn in the Realm of Sales Automation

opentools.ai


March 18, 2026

Arena Leaderboard: Powerhouse Benchmark Amid Conflicts

Arena, formerly LM Arena, is the de facto public leaderboard for frontier LLMs, influencing funding, launches, and PR cycles.

Billed as the...

The leaderboard “you can’t game,” funded by the companies it ranks

March 18, 2026 · techcrunch.com

March 18, 2026

Manus AI: Desktop Launch to Meta Acquisition Amid China Backlash

Manus AI agent's rapid timeline:

Desktop launch announced amid top AI stories.
Meta acquisition escalates; China restricts Manus executives from...

March 18, 2026

Rivia's €13M Bet on Agentic AI for Clinical Trials

Agentic AI targets clinical trial chaos:

Zurich startup Rivia raises €13M after €3M to unify fragmented data.
Builds AI agents to actively manage...

Rivia raises €13M to bring agentic AI to clinical trials

thenextweb.com


March 18, 2026

Aramco Fuels Middle East Deepfake Defenses

Wa’ed Ventures, Saudi Aramco's $500M venture capital arm, has made a strategic investment in Resemble AI to expand deepfake detection capabilities across the Middle East. The deal brings regional capital to AI ethics and security.

Aramco’s Wa’ed Ventures invests in Resemble AI to expand deepfake detection capabilities in Middle East

arabnews.com


March 18, 2026

Mamba-3 Crushes Transformers: 4% Better, 7x Faster with Industry Buy-In

Together.ai's open-source Mamba-3 delivers a 4% edge over Transformers on language benchmarks while running 7x faster on long sequences via H100...

New Mamba-3 AI Model Beats Transformers by 4%, Runs 7x Faster

winbuzzer.com


March 18, 2026

New Benchmarks Expose Frontier LLMs' Math Gaps Toward AGI

Emerging benchmarks highlight LLM struggles in advanced math, lagging humans and revealing eval flaws:

MathVista: GPT-4V tops 12 models at 49.9% vs....

March 18, 2026

GPT-5.1 Emerges as Top LLM for Production Voice Agents

Hands-on test of 5 LLMs: GPT-4.1, GPT-5.1, GPT-5.2, Claude 4.6, Gemini 3 on latency, function calling, instruction following, conversation ability,...

March 18, 2026

Medical AI Evals Unmask Format Flaws and Domain Gaps

PeruMedQA benchmark targets LLMs on Peruvian physician specialty exams, building a fine-tuned dataset to probe real gaps.
Triage failures stem...

March 18, 2026

Open-Source Power Benchmarking Framework for VLMs and LLMs

Open-source framework for performance evaluation of widely used AI workloads in computer vision and large language models
Focuses on power-aware...

Power-Aware Performance Analysis for Vision and Language Models

March 18, 2026 · arxiv.org

March 18, 2026

AI's Fog of War: Retrospective Challenges in War Reasoning

Can AI reason about a war before its trajectory becomes historically obvious? New arXiv paper flags how retrospective analysis complicates benchmarking long-horizon strategic capabilities.

[2603.16642] When AI Navigates the Fog of War - arXiv

March 18, 2026 · arxiv.org

March 18, 2026

OpenAI's $110B Funding: Largest AI Investment Ever

OpenAI's $110 billion funding round marks the largest AI investment in history, reshaping the AI investment landscape in spring 2026 and signaling major shifts ahead.

OpenAI’s $110 Billion Funding Round: The Largest AI Investment in History

mexc.com


March 18, 2026

Step-Level Evals Push Tool-Using Agents Toward Reliability

Emerging trend: New benchmarks target step-level quality for safe, heavy-duty AI agents.

AgentProcessBench evaluates LLM tool-use in real-world...

March 18, 2026

NVIDIA & Mistral's Nemotron Coalition Challenges Closed AI Giants

NVIDIA and Mistral launch the Nemotron Coalition to deliver frontier-level open-source AI models, pitting Mistral Small 4 and DGX Cloud against...

March 18, 2026

Cognitive Framework for Measuring AGI Progress Hits 49 Points on HN

A new cognitive framework proposes a way to measure progress toward AGI, offering an angle beyond standard benchmarks. The post drew 49 points on Hacker News.

Measuring progress toward AGI: A cognitive framework

March 18, 2026 · news.ycombinator.com
