Claude Mythos Preview: AI Finds Its Own Zero-Days

181 Firefox exploits, a 27-year-old OpenBSD bug, a full FreeBSD RCE with a ROP chain. Why the offense-defense balance of the security community is shifting right now.

On April 7, 2026, Anthropic published an assessment of the cybersecurity capabilities of the new Claude Mythos Preview. The tone is unusually direct for a lab that normally hedges: the model finds and exploits zero-day vulnerabilities in production software largely without human guidance, at a scale that would have sounded unrealistic a month ago.

"Less than a month ago we wrote that Opus 4.6 was better at identifying and fixing vulnerabilities than at exploiting them. These capabilities have emerged very quickly." – Anthropic Red Team

The Jump, in Numbers

The most important data point in the report isn't any single bug — it's a ratio. Between Opus 4.6 and Mythos Preview there are not weeks of engineering, but an order of magnitude in capability:

181

working Firefox exploits (Mythos Preview)

working Firefox exploits (Opus 4.6)

Tier-5 cases (full code control) on OSS-Fuzz

The 90x jump on Firefox isn't the most interesting part. What matters more is that after the initial prompt, Mythos Preview needs no further human intervention. An engineer with no specialized security training can file a task in the evening and wake up to a working exploit.

What the Model Actually Found

OpenBSD SACK implementation 27 years old

Mythos Preview identified a flaw in the Selective Acknowledgement logic that allows a signed integer overflow via TCP sequence numbers. The result: a TCP-based denial of service against an operating system widely treated as a reference for secure network stacks.

FFmpeg H.264 codec 16 years old

A 65,536-slice collision triggered out-of-bounds writes. The remarkable thing isn't the bug itself but the fact that it survived this long in one of the most heavily fuzzed codebases in the world.

FreeBSD Remote Code Execution 17 years old

The most striking case in the report. The model autonomously built an exploit over six RPC requests, bypassed authentication, accounted for KASLR randomization, and assembled a 20-gadget ROP chain distributed across multiple packets — no follow-up prompts, no manual corrections.

Linux kernel privilege escalation

Mythos Preview chained three to four separate vulnerabilities to bypass KASLR and obtain root, including subtle race conditions and complex heap sprays. Chaining multiple bugs into a full attack path used to be one of the clearest dividing lines between "automated bug finder" and "human exploit author."

Why This Is More Than "Better Fuzzers"

Fuzzing tools have been finding bugs for years. Two things are new. First, the scalability of the search: the model can look for vulnerabilities in parallel across every relevant file, including the ones human reviewers intuitively assume must already have been checked by someone. That's exactly where the 16-, 17- and 27-year-old bugs live.

Second, chaining. Finding a use-after-free is one thing. Combining it with an info leak, defeating KASLR, building a ROP chain, and keeping all of it stable across multiple network packets used to be a deeply human discipline. Mythos Preview does both in one continuous run.

"After two decades of stable equilibrium, language models could destabilize this precarious balance."

What This Means for Defenders

The report is explicitly not a product announcement; it's an early warning. Anthropic pairs it with an initiative called Project Glasswing, intended to harden critical software before similarly capable models become broadly available. The operational consequences come down to a short list:

For security and IT teams

Shorten patch windows. A 30-day cycle assumes an attacker with older tools than the model in this report.
Enforce auto-updates wherever you can defend the trade-off. The old cost-benefit math has shifted.
Take defense in depth seriously. KASLR, StackProtector, W^X, sandboxing all remain effective — precisely because the model breaks systems that lack them.
Deploy AI-assisted bug finding now. Available models like Opus 4.6 can already scan your own codebase, before an attacker does the same with a Mythos-class model.
Review disclosure processes. If the volume of reported bugs jumps, triage becomes the bottleneck.

What It Means for Companies Using AI

My clients have been asking me regularly whether they should run AI agents in production. The Mythos Preview report doesn't change the answer, but it changes the priority list. Anyone running AI in production should check three things over the next few weeks:

Supply-chain exposure. Which open-source components sit deep inside your AI workflows? Are your dependency scanners up to date? A 17-year-old FreeBSD RCE begs the question of how many similar bugs are still sitting in libraries you never touched directly.
Sandbox architecture. Agents that execute code, read files or open network connections need real isolation. I covered this in the OpenClaw vs. NemoClaw article. Mythos Preview makes it more urgent, not less.
Incident-response pipeline. When a patch drops on a Friday at 5pm, who deploys it? The shorter the window between disclosure and exploitation, the less forgiving the process is to manual steps.

What the report does not say

Mythos Preview is not generally available. The tests ran in a controlled environment with Anthropic's own safeguards. The model is not an autonomous attacker loose on the internet. But the capability exists, and the history of the security community is clear: what works in a lab today tends to work outside of it not long after.

Bottom Line

The actual finding of the report isn't "AI finds zero-days" — that was true in a limited sense before. It's the speed at which this capability is compounding. Between "assists with analysis" and "autonomously exploits with a ROP chain over six RPCs" was less than a month of model development. If you're planning security architecture, assume this trend continues.

This isn't a panic message. It's an invitation to look at your own patching, disclosure and isolation processes honestly — and to build them as if an attacker with Mythos-class capabilities is already in the room.

Source: Assessing Claude Mythos Preview's cybersecurity capabilities (Anthropic Red Team, April 7, 2026)

You're running AI in production and want to know where your supply chain and agent architecture actually stand? Get in touch — I help with evaluation and hardening.

Claude Mythos Preview: When AI Finds Zero-Days on Its Own