Your cybersecurity assumptions are about to get shredded by an AI model that’s too powerful for public release. Anthropic just announced Project Glasswing, pairing its restricted Claude Mythos Preview with 12 major tech companies to hunt zero-day vulnerabilities before the bad guys find them. This isn’t another AI safety theater production—it’s damage control for what happens when machines surpass human hackers.
The Ultimate Penetration Testing Machine
Claude Mythos Preview autonomously discovered thousands of vulnerabilities across major operating systems and open-source projects, including a 27-year-old flaw in OpenBSD and a 16-year-old bug in FFmpeg that had survived five million automated tests. Unlike previous models that needed human guidance, Mythos writes working exploits entirely on its own. Your “secure” Linux kernel? The AI found multiple exploit chains that yield full system compromise.
According to Anthropic’s Newton Cheng, the model “can surpass all but the most skilled humans at finding and exploiting software vulnerabilities.” Translation: unless your security team is world-class, it just became obsolete.
A $100 Million Bug Bounty Program on Steroids
Project Glasswing operates like an exclusive hacker collective with corporate backing. Twelve launch partners, including Amazon, Apple, Google, and Microsoft, get direct access, while 40 additional organizations that maintain critical software also participate in the program. Anthropic committed $100 million in usage credits during the preview period, plus $4 million to open-source foundations.
The disclosure pipeline includes manual validation before any bug reports reach overwhelmed maintainers. Think of it as professional triagers filtering an avalanche of vulnerability reports, because apparently even responsible AI deployment can accidentally DoS the people trying to fix things.
Racing Against Proliferation (And Credibility Issues)
Here’s where the timing gets suspicious. Anthropic warns that similar AI capabilities will spread to competitors within 6 to 18 months, making 2026 attacks “significantly more likely.” The company has briefed government officials that the economic and national-security fallout could be severe. Meanwhile, days before this announcement, Anthropic suffered two embarrassing security incidents: a CMS misconfiguration that exposed 3,000 internal assets and an npm packaging error that leaked 512,000 lines of source code.
Nothing says “trust us with dangerous AI” quite like accidentally publishing your own internal documents about that dangerous AI.
The Transparency Gamble
Anthropic is betting that controlled sharing of Mythos Preview will create enough defensive advantage before hostile actors develop similar capabilities. It’s essentially flooding the zone with friendly hackers before the unfriendly ones show up to the party. Whether transparency can outrun proliferation remains the multi-billion-dollar question—along with whether your data survives the experiment.