Automated Vulnerability Chaining

Too Dangerous to Release? Cloudflare Warns Over Anthropic’s Cyber AI “Mythos”

Anthropic, Anthropic AI, Claude Mythos, Cloudflare security, Anthropic Mythos AI vulnerability chaining, Cloudflare warns about cyber AI risks, Cloudflare warns about Mythos AI, Mythos, Mythos AI, AI
Facebook
X
LinkedIn
Reddit
WhatsApp
Source: gguy / Shutterstock.com

After conducting internal tests, Cloudflare is warning that Anthropic’s new AI model Mythos may be capable of autonomously chaining together harmless software flaws into severe cyberattacks.

Global cloud and internet infrastructure provider Cloudflare has released detailed findings from its evaluation of “Mythos Preview,” an advanced cybersecurity-focused AI model developed by Anthropic. In an official blog post published Monday, Cloudflare Chief Security Officer Grant Bourzikas described the system as highly capable of automatically discovering and exploiting software vulnerabilities.

Ad

According to Cloudflare, the model’s technical sophistication and inherent abuse potential make a public release premature. The company argues that stronger safeguards and operational restrictions must be implemented before broader access is granted. As a core provider securing major parts of the global internet, Cloudflare has a direct interest in preventing infrastructure failures and supply-chain attacks that could impact millions of customers worldwide.

Mythos Shows Major Leap in Contextual Analysis

To assess Mythos under real-world conditions, Cloudflare integrated the model into an internal testing environment and deployed it against more than 50 production repositories. The analyzed codebases included critical infrastructure components, large-scale networking systems, internal platform tooling, and widely used open-source software packages.

Previous cybersecurity-focused language models often struggled to connect abstract logical flaws across sprawling codebases. In practice, they typically produced isolated line-by-line findings and generated overwhelming numbers of false positives that required extensive manual review by security teams.

Ad

Cloudflare says Mythos demonstrated a significant shift in both processing quality and strategic context awareness.

AI Can Chain Low-Risk Bugs Into Critical Exploits

The most significant advancement appears to be Mythos’ ability to perform so-called “vulnerability chaining.” According to Bourzikas, the system can autonomously identify multiple unrelated low-severity bugs that would normally remain unresolved in engineering backlogs for months.

Rather than treating these flaws independently, Mythos combines them into a single high-impact attack path capable of escalating into a serious compromise.

More concerning for defenders, the AI reportedly does not stop at theoretical analysis. Cloudflare says Mythos directly generated functional proof-of-concept (PoC) exploit code during testing.

Bourzikas noted that the model’s reasoning process and intermediate analysis steps resembled the workflow of an experienced senior security researcher rather than the rigid output patterns of traditional automated scanners. While this could dramatically reduce threat validation time for enterprise security teams, it may also shorten the defensive preparation window before attackers begin leveraging similar AI-generated exploit chains.

Simple Prompt Tweaks Bypassed Safety Guardrails

Despite the model’s advanced capabilities, Cloudflare’s investigation also exposed major weaknesses in Mythos’ internal safety guardrails.

During testing, researchers found that the model behaved inconsistently and could be manipulated through relatively simple prompt-engineering techniques.

In one documented example, Mythos initially refused to analyze a provided code snippet because the request allegedly violated policies designed to prevent malware development. However, researchers then removed the hidden .git configuration directory from the repository without changing a single line of source code. On the second attempt, the AI accepted the exact same request and performed a deep vulnerability analysis without hesitation.

Cloudflare also noted that identical prompts frequently produced different outcomes due to the probabilistic nature of large language models. According to the company, this inconsistency makes the system unsuitable for uncontrolled mass-market deployment in its current state.

Human Researchers Still Outperform AI in Deep System Analysis

At the same time, the tests highlighted the current limitations of autonomous AI-driven security research.

Cloudflare says human analysts still outperform AI systems when conducting large-scale investigations across extremely complex codebases. Experienced researchers can maintain focus on a specific attack vector, application workflow, or vulnerability class over extended periods while manually reconstructing logical trust boundaries throughout an entire system architecture.

AI models, by contrast, continue to struggle with context overload when performing unrestricted free-text analysis across massive repositories, often causing throughput and reasoning quality to degrade significantly.

As a result, Cloudflare does not currently view Mythos as a fully autonomous security analyst. Instead, the company describes it as a highly efficient copilot for experienced human researchers who already possess contextual knowledge and can guide the system toward relevant attack surfaces.

Project Glasswing: Controlled Global Access Program

The controlled testing initiative is part of a broader security strategy by Anthropic. The company first announced Mythos in April 2026 but deliberately declined a public release after internal testing reportedly showed the model could autonomously discover zero-day vulnerabilities in widely used operating systems and web browsers.

Under an initiative called “Project Glasswing,” Anthropic has instead granted exclusive access to roughly 40 global partner organizations. Participants reportedly include infrastructure providers such as Amazon Web Services (AWS), Microsoft, Cloudflare, and several major international banking institutions.

The goal is to allow defenders to proactively harden critical software infrastructure before comparable cyber-focused AI systems inevitably emerge on unregulated underground markets accessible to criminal actors.

Debate surrounding the disclosure of findings from Project Glasswing has intensified in recent months, particularly as international financial and cybersecurity regulators such as the Financial Stability Board push for stronger global coordination to preserve systemic stability.

Lisa Löw

Lisa

Löw

Junior Editor

it-daily.net

Ad

Artikel zu diesem Thema

Weitere Artikel