AI Jailbreak Tracker

RSAC/NDSS 2026 spotlights: Claude ShadowPrompt, zero-click persuasion, indirect prompt injection and agent defenses, plus the Claude Code leaks and GrafanaGhost

Key Questions

What is ShadowPrompt, as highlighted at RSAC/NDSS 2026?

ShadowPrompt refers to a set of zero-click techniques (XSS, RCE, persuasion, and indirect prompt injection) reported against Cursor, Claude, Salesforce, and Gemini, with a reported 80-90% figure across those products. It uses Claude subtasks to exploit vulnerabilities without direct user interaction. The conferences spotlighted these techniques alongside scalable, OWASP-aligned defenses.

What vulnerabilities were exposed in the Claude Code leaks?

The Claude Code leaks revealed issues including KAIROS multi-agent flaws, memory exploits, Undercover techniques, poison pills, and a CLAUDE.md SSH vulnerability. A critical flaw left deny rules vulnerable in long workflows, and researchers found the first major vulnerability shortly after the accidental source code leak. The leaks also exposed internal architecture and supply-chain threats.

What is GrafanaGhost and how does it work?

GrafanaGhost is an indirect prompt injection technique that uses a URL bypass and keyword tricks to exfiltrate data silently from AI workflows. It lets attackers manipulate AI agents without direct access to them. The technique was highlighted in the RSAC/NDSS discussions on agent defenses.
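One common mitigation for URL-based exfiltration of this kind is to filter links in agent output against a host allowlist before anything is rendered or fetched. The sketch below is a generic illustration, not a description of how GrafanaGhost works or how any vendor patched it; the host names and the function names `is_allowed` and `strip_untrusted_urls` are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist: the only hosts the agent may link to or load from.
ALLOWED_HOSTS = {"grafana.example.internal", "docs.example.com"}

# Loose matcher for http(s) URLs embedded in model output.
URL_PATTERN = re.compile(r"""https?://[^\s<>"')\]]+""", re.IGNORECASE)


def is_allowed(url: str) -> bool:
    """Return True only for http(s) URLs whose exact host is allowlisted."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) fail closed.
        return False
    if parts.scheme not in ("http", "https"):
        return False
    # hostname ignores userinfo and port, so a URL such as
    # https://docs.example.com@evil.test/ resolves to evil.test here.
    host = (parts.hostname or "").lower()
    return host in ALLOWED_HOSTS


def strip_untrusted_urls(text: str) -> str:
    """Replace every URL that fails the allowlist with a placeholder."""
    return URL_PATTERN.sub(
        lambda m: m.group(0) if is_allowed(m.group(0)) else "[blocked-url]",
        text,
    )
```

Matching on the parsed host rather than on a substring of the URL is the point of the sketch: prefix and keyword checks are the kind of filter that a crafted URL can slip past.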

What is the MCP infinite tokens vulnerability?

MCP reportedly permits unbounded token use and limit bypasses, which the tracker rates as a critical vulnerability to be addressed with scalable defenses in line with OWASP guidance. The issue was discussed urgently in the Cursor community forums. It enables excessive resource consumption and potential exploits in AI coding agents.
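Unbounded consumption of this kind is usually countered by enforcing a hard budget on the client side, outside the model's control. The following is a minimal sketch of that idea under stated assumptions: `TokenBudget`, `BudgetExceeded`, and both limits are hypothetical names and values, and nothing here is part of the MCP specification or any MCP SDK.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a tool call would cross a configured token limit."""


@dataclass
class TokenBudget:
    max_total: int     # hypothetical ceiling for the whole session
    max_per_call: int  # hypothetical ceiling for a single tool call
    used: int = 0

    def charge(self, tokens: int) -> int:
        """Record one call's usage and return the remaining session budget."""
        if tokens < 0:
            raise ValueError("token count must be non-negative")
        if tokens > self.max_per_call:
            raise BudgetExceeded(f"call of {tokens} tokens exceeds per-call limit")
        if self.used + tokens > self.max_total:
            raise BudgetExceeded("session token budget exhausted")
        # Usage is committed only after both checks pass.
        self.used += tokens
        return self.max_total - self.used
```

A caller would invoke `charge` before forwarding each tool result to the model and abort the workflow on `BudgetExceeded`, so a server that ignores its own limits cannot drive consumption past the ceiling.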

What defenses were presented at RSAC/NDSS 2026 for AI agent attacks?

Defenses include Prompt Guard, HiveFence, NemoClaw, Zapier, and AgentWatcher, which address prompt injection and supply-chain shifts. NDSS papers such as Rennervate, Beyond Jailbreak, T-MAP, and Photon were featured, along with OpenAI bounties. These tools are aimed at mitigating zero-click persuasion and indirect prompt injection in multi-agent systems.

How do Claude Code flaws impact complex engineering tasks?

February updates to Claude Code reportedly made it unusable for complex tasks because inefficient system prompt handling wasted tokens on long prompts. Users are advised to supply the system prompt with --system-prompt-file for better efficiency. Deny rules also become vulnerable in extended workflows, which compounds the security risk.

What role do source code leaks play in AI supply-chain threats?

Leaks such as the Claude Code leak exposed internal instructions, agent architecture, and vulnerabilities such as SSH issues and multi-agent flaws. They highlight supply-chain shifts and enable attacks such as prompt injection against government GenAI deployments. Videos and analyses deconstruct these leaks and point to critical pieces missing from agents.

What NDSS topics relate to jailbreaks and AI security?

NDSS 2026 covered Rennervate, Beyond Jailbreak, T-MAP, Photon, and OpenAI bounties for advanced jailbreak defenses. These align with RSAC spotlights on agent hardening amid code leaks. They emphasize verified pentesting for agentic cyberattacks.

RSAC detailed Claude subtasks and ShadowPrompt zero-click XSS, RCE, persuasion, and indirect prompt injection (80-90% across Cursor, Claude, Salesforce, and Gemini), MCP infinite tokens, and scalable OWASP defenses, as well as GrafanaGhost, an indirect prompt injection via URL bypass and keyword tricks for silent exfiltration in AI workflows. The Claude Code leaks covered KAIROS, multi-agent, memory, Undercover, poison pills, and the CLAUDE.md SSH vulnerability. NDSS featured Rennervate, Beyond Jailbreak, T-MAP, Photon, and the OpenAI bounty. Defenses include Prompt Guard, HiveFence, NemoClaw, Zapier, and AgentWatcher amid a supply-chain shift.

Updated Apr 9, 2026