AI Consciousness Nexus

Theory of Mind Double-Agent Tests for AI Deception

Key Questions

What are Theory of Mind (ToM) double-agent tests for AI?

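Eskin and Akhaliq develop Theory of Mind (ToM) probes intended to detect deception, cheating, and leaks in reinforcement learning (RL) settings. The probes place an AI in scenarios that it can handle correctly only by reasoning about other agents' mental states, such as what they know or believe.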
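
The page does not say how these probes are built, so the following is only a minimal sketch of one classic kind of ToM scenario, a false-belief test, and not the authors' method. All names here (`Event`, `witnesses`, `believed_location`, `passes_probe`) are hypothetical and chosen for illustration. The idea is that an agent's belief about an object is fixed by the last move that agent witnessed, which may differ from where the object really is.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """One move of an object, plus who saw it (hypothetical structure)."""
    actor: str
    obj: str
    location: str
    witnesses: frozenset


def true_location(events: List[Event], obj: str) -> Optional[str]:
    """Where the object actually is after all events."""
    loc = None
    for e in events:
        if e.obj == obj:
            loc = e.location
    return loc


def believed_location(events: List[Event], obj: str, agent: str) -> Optional[str]:
    """Where `agent` thinks the object is: the last move they witnessed."""
    loc = None
    for e in events:
        if e.obj == obj and agent in e.witnesses:
            loc = e.location
    return loc


def passes_probe(events: List[Event], obj: str, agent: str, answer: str) -> bool:
    """True if `answer` matches the agent's belief rather than the ground truth."""
    expected = believed_location(events, obj, agent)
    return expected is not None and answer.strip().lower() == expected.lower()


# Sally sees the marble go into the basket, then Anne moves it unobserved.
events = [
    Event("Sally", "marble", "basket", frozenset({"Sally", "Anne"})),
    Event("Anne", "marble", "box", frozenset({"Anne"})),
]

# The question put to the model: "Where will Sally look for the marble?"
print(true_location(events, "marble"))               # actual location
print(believed_location(events, "marble", "Sally"))  # Sally's (false) belief
```

In this sketch, a model that answers with the object's actual location instead of Sally's belief fails the probe, which is the kind of gap between reality and another agent's mental state that a ToM test is meant to expose.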

What is SealQA and its role in AI deception?

SealQA is one of the benchmarks used to probe AI deception and blindspots. It evaluates agent observations in multimodal environments.

How do delusion spirals relate to AI deception?

MERRIN and related work identify delusion spirals that arise from agentic behavior. ToM tests can reveal obsessions and blindspots like those described in Gomez's research.

Topics: Eskin and Akhaliq ToM probes; RL deception, cheating, and leaks; SealQA; MERRIN delusion spirals; agent observation blindspots (Gomez); multimodal agent environments.

Updated Apr 27, 2026