The Post-Singularity World

Cyber Security

April 13, 2026

A series of recent reports about cybersecurity vulnerabilities discovered by next-generation AI systems (including Anthropic's announced Mythos model) offers a good illustration of the risks that rapid advances in AI capabilities can pose to society, and of the direction in which cybersecurity teams in organisations are headed.

The pattern itself is not new. When OpenAI published GPT-2 in 2019, the debate about whether it was "too dangerous to release" marked an early turning point: the moment AI capabilities became powerful enough that releasing them started to raise security questions.

Back in 2019, the concern was that releasing the GPT-2 model without proper guardrails might enable malicious actors to create chat systems that could pass for humans, in a world where people were increasingly worried about bots manipulating online sentiment and elections. [6] At the same time, the staged release of GPT-2 was widely ridiculed as a publicity stunt.

What has changed since then is that the focus has become more technical. Research from Google DeepMind, initiatives like the DARPA AI Cyber Challenge [5], and early previews of Anthropic’s Mythos model [1] suggest that next-generation models are capable of discovering thousands of previously unknown vulnerabilities in operating systems, browsers, and other software that underpins the day-to-day infrastructure our society relies on to function: power grids, financial systems, government networks, and national defense systems.

Some of these flaws have gone undetected for decades, not because they were unimportant, but because many of them are obscure, highly technical, multi-step exploits buried deep in millions of lines of code.

One example that shows how technical some of these cases are is a 27-year-old vulnerability in OpenBSD that was recently found by Anthropic's new Mythos model [1] [2]. The issue involved a subtle bug in the way OpenBSD handles a particular sequence of data packet manipulations, which could allow an attacker to crash the system remotely without authentication. OpenBSD is an operating system widely used in firewalls, routing infrastructure, and other security-critical environments, with a reputation for being one of the most security-hardened operating systems.

Another example is a newly found vulnerability in 16-year-old codec code in FFmpeg, the media library widely used to play and process video in web browsers, streaming platforms, and media applications. The flaw allows an attacker to construct a malicious video file that causes the decoder to write data outside its allocated memory bounds, potentially crashing the application.
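
The class of bug described here, an out-of-bounds write, can be illustrated with a toy sketch. This is not FFmpeg's code: the one-byte length header, the 16-byte buffer, and the function names decode_unchecked and decode_checked are all hypothetical. Python itself is memory-safe, so the sketch simulates adjacent memory with a single bytearray in which the frame buffer is followed by unrelated data.

```python
FRAME_SIZE = 16  # hypothetical size of the decoder's frame buffer


def new_memory() -> bytearray:
    """Simulated memory: a 16-byte frame buffer followed by 4 bytes of unrelated data."""
    return bytearray(FRAME_SIZE) + bytearray(b"SAFE")


def decode_unchecked(memory: bytearray, packet: bytes) -> None:
    """Trusts the length declared in the packet's first byte (the bug)."""
    declared_len = packet[0]
    payload = packet[1:1 + declared_len]
    for i, byte in enumerate(payload):
        memory[i] = byte  # may run past FRAME_SIZE into the neighbouring data


def decode_checked(memory: bytearray, packet: bytes) -> None:
    """Rejects packets whose declared length exceeds the frame buffer."""
    declared_len = packet[0]
    if declared_len > FRAME_SIZE:
        raise ValueError("declared length exceeds frame buffer")
    payload = packet[1:1 + declared_len]
    for i, byte in enumerate(payload):
        memory[i] = byte


# A crafted packet that declares 20 payload bytes for a 16-byte buffer.
malicious = bytes([20]) + b"A" * 20

mem = new_memory()
decode_unchecked(mem, malicious)
print(mem[FRAME_SIZE:])  # the neighbouring bytes have been overwritten

mem = new_memory()
try:
    decode_checked(mem, malicious)
except ValueError as err:
    print("rejected:", err)
print(mem[FRAME_SIZE:])  # the neighbouring bytes are intact
```

In a memory-unsafe language the unchecked version would corrupt whatever happens to sit next to the buffer, which is why a single missing length check in a rarely exercised code path can survive for years.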

Of course, we should be skeptical and not take self-published reports at face value when they come from an AI company building hype around the capabilities of its state-of-the-art models in the middle of a press campaign. These "thousands" of newly discovered vulnerabilities have not been verified by external experts, nor discussed publicly in detail for security reasons, making it hard to accurately estimate their relative importance. Anthropic's own analysis also acknowledged that some of these bugs are not critical vulnerabilities, and that it would be challenging to turn some of these vulnerabilities into functioning exploits.

But even if each state-of-the-art model is able to find only a few new security vulnerabilities, it is still problematic when these new models are released on the market immediately.

Historically, finding vulnerabilities like these required such exhaustive technical analysis that it would take highly skilled researchers weeks to uncover a single one. Because they were both difficult to find and difficult to exploit, their complexity provided a natural barrier against most attackers.

But that barrier largely disappears for AI agents. An AI system can enumerate thousands of attack paths continuously, iterating in a fully automated way until it finds a way in.

The most effective defense against this is to fight fire with fire. By counter-deploying the most advanced AI systems in highly controlled and secured environments, and making them continuously probe critical infrastructure, we can detect these security vulnerabilities before adversaries do.

To detect these security vulnerabilities in organisations before attackers exploit them, security researchers at these organisations need early access to AI tools, months before those tools become commercially available to potential adversaries. If the most advanced models are already available on the open market while defensive infrastructure still runs on older generations, the technology gap itself becomes a strategic vulnerability.

Anthropic has indicated that its new Mythos model will be made available to a select group of organizations, including AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, to strengthen the security of their critical software before the model is made available to the public.

This kind of structured, high-trust access is a promising model for how frontier AI capabilities can be released responsibly: rather than calling for development to halt, it ensures the most capable tools reach defenders first.

But Anthropic's decision to limit Mythos's release also places an unusual amount of power in the hands of the company, as it is effectively deciding who gets access. Some security experts and open-source advocates argue the world would be safer if the latest models were released as open source, so that every defender could benefit. But forcing commercial companies to release the source code of their latest models to the world would make it impossible for them to commercialise their technological advances and would remove the incentive to invest billions in technology research. At the same time, the massive hardware these models require still limits who can actually run them in practice, while an open-source release would also make them immediately available to attackers.

Under the EU AI Act, the most advanced general-purpose AI models are presumed to pose “systemic risk” once their cumulative training compute exceeds a threshold, currently set at 10^25 floating-point operations (FLOPs). This arbitrary threshold does not reliably predict the risk of a model. Future models could be capable enough to pose a risk while being trained more efficiently with fewer FLOPs, or could exceed the threshold without posing such a risk.

Once the training of a model exceeds this threshold, the development of that model is subject to additional obligations, including mandatory risk assessments, adversarial testing, cybersecurity safeguards, and incident reporting.
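
To make the threshold concrete, the following sketch estimates training compute with the widely used rule of thumb of roughly 6 floating-point operations per parameter per training token. That approximation, the function names, and the two model configurations are illustrative assumptions; they are not taken from the Act or from any real model.

```python
EU_AI_ACT_THRESHOLD_FLOPS = 1e25  # cumulative training compute, in floating-point operations


def estimate_training_flops(parameters: float, tokens: float) -> float:
    """Rule-of-thumb estimate: about 6 operations per parameter per token (an assumption)."""
    return 6.0 * parameters * tokens


def exceeds_threshold(parameters: float, tokens: float) -> bool:
    """True if the estimated training compute is above the 10^25 threshold."""
    return estimate_training_flops(parameters, tokens) > EU_AI_ACT_THRESHOLD_FLOPS


# Two hypothetical models trained on the same 15 trillion tokens.
smaller = estimate_training_flops(70e9, 15e12)   # about 6.3e24
larger = estimate_training_flops(400e9, 15e12)   # about 3.6e25

print(f"{smaller:.1e}", exceeds_threshold(70e9, 15e12))
print(f"{larger:.1e}", exceeds_threshold(400e9, 15e12))
```

Under these assumptions the smaller hypothetical model stays below the threshold however capable it turns out to be, because the rule looks only at compute and not at what the model can do.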

While the Act reflects a growing recognition that frontier AI systems require structured oversight proportional to their potential impact, what is still missing is a formal mechanism to ensure that defenders systematically receive access ahead of adversaries.

Ensuring this might be worth active investment at a national level, with governments funding AI security teams, AI security research, and new regulations to ensure that the most capable commercial models are always made available to critical organisations for early security scans.

Every organization that depends on software infrastructure (which is now essentially every organization) would be wise to plan for setting up AI-driven security pipelines in the future: deploying AI agents for continuous monitoring, running simulated attacks, tracking discovered vulnerabilities, and patching them systematically before they can be exploited. [3]
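
As a rough sketch of the tracking and patching step in such a pipeline, the snippet below keeps a list of findings and returns the open ones ordered by severity. The Finding fields, the 0 to 10 severity scale, and the example entries are hypothetical; a real pipeline would feed this from scanning agents and an issue tracker.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    component: str
    description: str
    severity: float  # hypothetical 0-10 scale, higher is worse
    patched: bool = False


def patch_queue(findings: list[Finding]) -> list[Finding]:
    """Return unpatched findings, most severe first."""
    open_findings = [f for f in findings if not f.patched]
    return sorted(open_findings, key=lambda f: f.severity, reverse=True)


# Illustrative entries only; none of these describe a real system.
findings = [
    Finding("video-decoder", "out-of-bounds write on crafted file", 8.1),
    Finding("login-service", "verbose error message", 3.2),
    Finding("packet-filter", "remote crash without authentication", 7.5, patched=True),
]

for finding in patch_queue(findings):
    print(f"{finding.severity:>4} {finding.component}: {finding.description}")
```

Keeping patched findings in the list rather than deleting them preserves a history that later scans can be compared against.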

In the near future, an AI security research team may be as foundational to an organization as its traditional security team.

References

  1. Anthropic (2026). Assessing Claude Mythos Preview’s cybersecurity capabilities.
  2. Anthropic (2026). System Card: Claude Mythos Preview.
  3. DeepMind (2025). Introducing CodeMender: an AI agent for code security.
  4. DeepMind (2025). A Framework for Evaluating Emerging Cyberattack Capabilities of AI.
  5. DARPA (2025). DARPA’s Artificial Intelligence Cyber Challenge.
  6. Slate (2019). When Is Technology Too Dangerous to Release to the Public?