Tech Policy Science Brief

Anthropic’s allegations that Chinese AI companies illicitly distilled Claude, and technical/security work on detecting distillation attacks

Anthropic Accuses Chinese AI Companies of Illicitly Distilling Claude; Industry Responds with Detection and Prevention Techniques

In the evolving landscape of global AI competition, recent allegations by Anthropic have spotlighted distillation attacks: the practice of illicitly extracting a proprietary model's capabilities by training another model to mimic its outputs. Anthropic accuses Chinese AI firms of using this technique against models like Claude to bolster their own systems, activities that it says threaten U.S. technological leadership and escalate the risk of militarized AI proliferation.

The Allegations: Chinese Firms Illicitly Distilling Claude

Anthropic has publicly accused several Chinese AI companies, including DeepSeek and MiniMax, of using distillation techniques to siphon proprietary capabilities from Claude. According to Anthropic, these firms employed sophisticated reverse engineering and model extraction methods to gain illicit access to Claude's advanced capabilities. The company detailed how these entities leveraged distillation, a process in which a smaller "student" model is trained to mimic the outputs of a larger, more capable proprietary model, to improve their own systems without authorization.
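The mechanics of distillation can be illustrated with a minimal sketch of the standard soft-label objective: the student is trained to match the teacher's temperature-softened output distribution. This is a generic textbook formulation for illustration only, not a description of Anthropic's systems or of any accused lab's actual pipeline.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax; higher T flattens the distribution."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max(axis=-1, keepdims=True)      # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The teacher's soft probabilities act as training targets; the T*T
    factor is the conventional gradient-scale correction. The loss is
    zero exactly when the student reproduces the teacher's distribution.
    """
    p = softmax(teacher_logits, T)          # teacher "soft labels"
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))) * T * T)
```

A distilling party would minimize this loss over many teacher query/response pairs; the concern raised in the allegations is precisely that such pairs were harvested from Claude at scale.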

An article from CyberScoop reports that Anthropic claims DeepSeek, MiniMax, and other Chinese labs "used Claude to improperly obtain capabilities to improve their own models," effectively bypassing legal and ethical boundaries. Similarly, Anthropic has detailed how Chinese firms attempt to reverse engineer large language models (LLMs) like Claude using sophisticated distillation techniques to gain strategic advantage.

Another recent article highlights that Chinese AI companies "distilled" Claude to enhance their models, raising alarms over technological theft and proliferation. These activities are viewed as part of a broader trend where state and private actors seek to accelerate their AI capabilities through illicit means, risking technological diffusion into militarized domains.

Industry and Community Response: Detecting and Preventing Distillation Attacks

In response, the AI community and industry leaders are actively developing techniques to detect and prevent distillation attacks. A prominent example is a Hacker News discussion titled "Detecting and Preventing Distillation Attacks", which emphasizes advanced detection algorithms capable of identifying model extraction efforts in real time.
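The flavor of such real-time detection can be sketched with a toy heuristic: systematic extraction tends to produce unusually broad, decorrelated query streams from a single client, so a monitor can flag clients whose recent queries share almost no vocabulary. The window size, similarity threshold, and Jaccard signal below are illustrative assumptions, not any vendor's deployed algorithm.

```python
from collections import deque

def token_set(text):
    """Crude bag-of-tokens representation of a query."""
    return set(text.lower().split())

class ExtractionMonitor:
    """Flags clients whose recent queries are unusually decorrelated,
    a heuristic signal of systematic model-extraction probing.
    Thresholds are illustrative, not tuned on real traffic."""

    def __init__(self, window=50, sim_threshold=0.05):
        self.window = window
        self.sim_threshold = sim_threshold
        self.history = {}                  # client_id -> deque of token sets

    def observe(self, client_id, query):
        """Record a query; return True if the client looks suspicious."""
        h = self.history.setdefault(client_id, deque(maxlen=self.window))
        toks = token_set(query)
        suspicious = False
        if len(h) >= 10:                   # need a minimum sample to score
            sims = [len(toks & prev) / max(1, len(toks | prev)) for prev in h]
            suspicious = sum(sims) / len(sims) < self.sim_threshold
        h.append(toks)
        return suspicious
```

Production systems would combine many such signals (embedding-space coverage, rate patterns, response-entropy probes) rather than rely on token overlap alone.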

Companies like Google are at the forefront of creating detection and attribution tools aimed at curbing illicit model theft. These systems analyze behavioral patterns, training data anomalies, and model response signatures to identify suspicious extraction activities. For instance, CodeLeash, a recent safety framework, is designed to ensure autonomous agents operate ethically and reliably, reducing the risk that distilled models could be used maliciously.
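One published family of attribution techniques works by planting canaries: the provider seeds distinctive prompt/answer pairs into its model's behavior, then checks whether a suspect model reproduces those planted answers at above-chance rates, which would suggest training on the provider's outputs. The helper below is a hypothetical illustration of that idea, not the actual API of Google's tooling or of CodeLeash.

```python
def canary_match_rate(suspect_answer_fn, canaries):
    """Fraction of planted canary answers a suspect model reproduces.

    canaries: list of (prompt, planted_answer) pairs seeded by the
    provider. A high match rate on obscure canary prompts suggests the
    suspect was trained on the watermarked model's outputs; a forensic
    claim would also need a statistical baseline for chance agreement.
    """
    hits = sum(1 for prompt, answer in canaries
               if answer.lower() in suspect_answer_fn(prompt).lower())
    return hits / len(canaries)
```

For example, a suspect model answering obscure canary prompts exactly as planted would score near 1.0, while an independently trained model should score near the chance rate.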

Furthermore, industry advocates stress the importance of robust security protocols and legal frameworks to discourage illicit distillation. As one expert notes, "Developing real-time detection tools is crucial to safeguard proprietary AI capabilities from reverse engineering and unauthorized extraction." These efforts aim to maintain strategic superiority and prevent the spread of militarized AI that could destabilize geopolitical balances.

The Broader Context: Geopolitical and Security Implications

The allegations and countermeasures come amid heightened geopolitical tensions. The U.S. government has imposed restrictions on Anthropic's models, citing national security concerns, and is exploring alternative providers such as OpenAI, which recently secured a Pentagon contract. Meanwhile, Chinese AI firms continue to accelerate their development, with DeepSeek’s upcoming V4 model expected to transform surveillance and autonomous military systems across Asia.

The diffusion of large language models and multimodal systems like V4 and Qwen3.5 has democratized access to advanced AI capabilities, but also amplified risks—especially when illicit distillation enables adversarial actors to embed proprietary features into military hardware or autonomous systems.

Moving Forward: Safeguarding AI Technologies

To address these threats, policy initiatives are underway. The U.S. government has emphasized responsible AI deployment through export controls and advanced detection capabilities. Internationally, nations are calling for binding treaties and norms to regulate autonomous weapons and prevent destabilizing arms races.

The challenge remains to balance innovation and security. As private investments pour into defense AI startups—such as NODA AI and MatX—the importance of detecting illicit activities becomes critical. Detection and attribution tools will be essential in maintaining strategic advantage and preventing the spread of militarized AI through illicit distillation.

Conclusion

The recent allegations by Anthropic have cast a spotlight on distillation attacks as a significant threat to AI security and national interests. While industry responses are advancing detection and prevention techniques, the geopolitical stakes continue to escalate. The next phase of AI development must prioritize robust safeguards, international cooperation, and ethical standards to prevent the illicit proliferation of powerful models like Claude and to avoid a destabilizing AI arms race. How effectively global actors confront and mitigate these threats will shape the future of AI-enabled security and warfare.

Sources (5)
Updated Mar 1, 2026