
AI Red Teaming Hub - NBot Tracker | nbot.ai


Created by George Kour

641 posts

Updated 116 days ago

0 scanned

Peer-reviewed papers and reports on AI red teaming, agent safety, and design patterns.


New Agentic Frameworks

🔥 Nvidia GTC Announcements: Nvidia announced the Nemotron Coalition with Mistral AI and others for open frontier AI...

March 18, 2026

Sashiko: Google's Agentic AI for Linux Kernel Review, Ripe for Safety Evals

Google engineers launched Sashiko, an agentic AI for Linux kernel code review. As a domain-specific agent, it is a natural candidate for benchmarking reward hacking and API abuse in software-automation evals such as FinToolBench. The Hacker News thread stood at 83 points.

Google Engineers Launch "Sashiko" for Agentic AI Code Review of the Linux Kernel

March 18, 2026 · news.ycombinator.com

March 18, 2026

Claude Dispatch: Sandboxed Local Agent Lowers Deployment Barriers

Phone-controlled desktop agent: Text Claude to access files, browse, build reports, execute tasks
Sandboxed & local with user approval before...

producthunt.com

Claude Dispatch

March 18, 2026

Nvidia's OpenClaw Accelerates Agentic AI—Red Teamers Take Note

Nvidia's GTC push commoditizes agentic frameworks, opening new red teaming frontiers:

OpenClaw hailed as 'next ChatGPT' by Huang, open source base...

March 18, 2026

UseAgents: Real-Time Registry for Agent Tool Discovery

UseAgents addresses the way an LLM's frozen training knowledge limits its access to tools. It provides a real-time registry where developers define tools and APIs that agents can discover and use immediately. No scraping or guessing, just structured tools powering the agentic web.

producthunt.com

UseAgents

March 18, 2026

SAFE-MCP: MITRE ATT&CK-Style Framework for Securing Tool-Using Agents

Emerging SAFE-MCP standardizes security for production tool-using AI agents via MCP, tackling threats like tool poisoning, prompt injection misuse,...

March 18, 2026

Structured Semantic Cloaking Targets LLM Latent Defenses

Modern LLMs enforce safety not only through surface-level filters but also within latent semantic representations. Structured Semantic Cloaking is a jailbreak attack that targets these deeper defenses.

Structured Semantic Cloaking for Jailbreak Attacks on Large ... - arXiv

March 18, 2026 · arxiv.org

March 18, 2026

Mistral AI Releases Forge

Mistral AI launched Forge, which quickly reached 565 points on Hacker News, a strong signal for agent developers watching new architectures and tooling.

Mistral AI Releases Forge

March 18, 2026 · news.ycombinator.com

March 18, 2026

TRUST-SQL: Multi-Turn RL Boost for Tool-Integrated Text-to-SQL on Unknown Schemas

TRUST-SQL introduces tool-integrated multi-turn reinforcement learning for Text-to-SQL over unknown schemas, aiming at more robust agent handling of dynamic database environments.

TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas

arxiv.org


March 18, 2026

SFCoT: Proactive Real-Time Defense Against CoT Jailbreaks

Key angles on SFCoT for LLM safety:

CoT vulnerability: LLMs excel at reasoning but succumb to jailbreak attacks via prompt injection, bypassing...

March 18, 2026

MiroThinker-1.7 & H1: Verification Powers Heavy-Duty Research Agents

MiroThinker-1.7 & H1 introduce verification techniques for building robust, heavy-duty research agents.

MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification

arxiv.org


March 18, 2026

Rising Specialized Benchmarks for LLM Agent Evals

Emerging trend in agentic evaluation frameworks:

PostTrainBench tests LLM agents like Claude Code automating post-training on Qwen/Gemma across...

March 18, 2026