Anthropic's Mythos Model Is Hunting Software Bugs Nobody Else Found

Anthropic's Mythos model is scanning critical software systems for vulnerabilities through Project Glasswing — and it's already found thousands.

Anthropic just previewed its most powerful AI model yet, and it's not chasing chatbot benchmarks. It's going after the vulnerabilities that have been sitting in codebases for twenty years, waiting for someone, or something, to find them.

What Is Mythos

The model is called Mythos. Anthropic described it internally as sitting above its existing Opus tier, calling it "by far the most powerful AI model we've ever developed" in a draft blog that leaked last month after a human error left it in a publicly accessible data cache. The company confirmed the model's existence this week and rolled out a limited preview under a new initiative called Project Glasswing.

Mythos is a general-purpose model, not a purpose-built security tool. Anthropic says its strengths are agentic coding and reasoning, which makes it well-suited to reading, understanding, and stress-testing code at scale. The same capability that makes it useful for writing and debugging software appears to make it capable of thinking through how software might break, which is the core skill you want in a vulnerability scanner.

Project Glasswing and the Partners Behind It

The scope of the initiative is not small. More than 40 partner organizations, including Amazon, Apple, Cisco, Microsoft, CrowdStrike, and the Linux Foundation, are deploying Mythos specifically for defensive security work. The goal is scanning first-party and open-source software systems for code vulnerabilities before bad actors find them first.

The partner organizations participating in Project Glasswing will share what they learn from the deployment, with the stated goal of benefiting the broader tech industry. The preview is not going to be made generally available, Anthropic said.

What It's Already Found

Anthropic says that over just the past few weeks, Mythos has already surfaced thousands of zero-day vulnerabilities, many of them critical, and many of them sitting untouched in codebases for one to two decades.

That last part is worth sitting with. Twenty years is a long time for a bug to hide. It speaks less to any failure on the part of human security researchers and more to the sheer scale of the problem. Modern software systems are enormous, layered, and interconnected in ways that make comprehensive manual auditing essentially impossible at any reasonable pace.

The Risk Hiding Inside the Capability

There is an obvious tension worth naming here. The leaked Anthropic document acknowledged that a model capable of finding vulnerabilities at this scale could just as easily be used to exploit them. That is not a hypothetical risk; it is the exact reason Anthropic is keeping the preview tightly controlled and specifically limited to defensive applications. Whether that control holds as the model eventually moves toward broader release is an open question, and it is the right question to be asking.

A Complicated Launch for Anthropic

The release arrives during a rough stretch for the company. Anthropic recently exposed nearly 2,000 source code files through a mistake in a Claude Code package update, then inadvertently took down thousands of GitHub repositories while trying to contain the fallout. On the regulatory side, Anthropic is in an active legal dispute with the Trump administration after the Pentagon designated it a supply-chain risk for refusing to enable autonomous targeting or surveillance of U.S. citizens. Anthropic says it has been in ongoing discussions with federal officials about Mythos, though the context around those conversations is murky given the ongoing legal fight.

Why This Actually Matters

What the Mythos preview signals, stripped of the surrounding noise, is that AI's most consequential near-term applications may not be consumer-facing at all. The infrastructure layer, the code that runs hospital billing systems, financial networks, and the open-source libraries every startup builds on, is deeply exposed. Decades of accumulated technical debt and under-resourced security teams have left the foundation shakier than most people realize. If a model like Mythos can close some of that gap systematically and at scale, that matters a lot more than another chatbot benchmark.