AI security & trust enablers surge — risks amid agent scaling

Key Questions

Why did Anthropic block free Claude access on OpenClaw?

Anthropic ended free Claude use via third-party apps like OpenClaw due to security risks, including 36% exfiltration rates. API or paid access is now required.

What was the Claude Code leak?

Anthropic accidentally leaked Claude Code's source, revealing a Tamagotchi-style pet and always-on agent features. It exposed memory systems and internal capabilities.

What is PentAGI?

PentAGI is an open-sourced fully autonomous AI red team for security testing. It autonomously identifies vulnerabilities in agent scaling.

What does TrojAI offer?

TrojAI extends its platform for securing AI environments with red teaming and runtime intelligence. It protects against threats in applications and tools.

What is Moonbounce?

Moonbounce raised $12M for real-time AI content moderation with 300ms checks. It addresses trust issues in scaling AI content.

How does Coder enhance AI security?

Coder's $90M-funded platform provides secure environments for enterprise AI development, integrating with Wiz, Cloaked, and Snyk.

What local options reduce security risks?

HF.js, PrismML Bonsai, Gemma 4 on Jetson enable on-device processing, minimizing cloud exfiltration risks. They support private, local agent deployment.

What is OpenAI doing for AI safety?

OpenAI opened applications for an external AI safety research fellowship to fund work on risks amid agent scaling.

Anthropic OpenClaw blocks (36% exfil/JustPaid); Claude leaks; Copilot ads; TrojAI red teaming/runtime intel; PentAGI autonomous red team; Coder/Wiz/Cloaked/Snyk; HF.js/PrismML/Gemma 4/Jetson local; Moonbounce $12M moderation (300ms checks).

Sources (10)