Anthropic Claude 4.7 & Mythos/Glasswing/NSA Launch
Key Questions
How does Claude 4.7 perform on public benchmarks?
Claude 4.7 leads public coding and vision benchmarks when prompt tweaks are applied, despite some tokenizer issues. That places it ahead of competitors in public evaluations.
What is Mythos and why is it being withheld?
Mythos is a model withheld from public release and reserved for elite cyber applications. It involves Glasswing partners and has been deployed by the NSA despite Pentagon bans and internal feuds, which highlights tensions in government AI adoption.
What opportunities does this create for security startups?
The developments are a wake-up call that points to opportunities for security startups, as governments adopt AI rapidly and safety practices show perceived hypocrisies. Related findings from Unit 42 underscore the cyber risks posed by frontier models.
In short: Claude 4.7 leads public coding and vision benchmarks with prompt tweaks, though it has tokenizer issues. Mythos is withheld for elite cyber use, with Glasswing partners and an NSA deployment despite Pentagon bans and feuds. The wake-up call highlights opportunities for security startups amid government adoption and safety hypocrisies.