Tuesday, August 5, 2025

"AI LLMs are now so clever that they can independently plan and execute cyberattacks without human intervention..."

From TechRadar, August 2:

AI model replicated the Equifax breach without a single human command 

  • Researchers recreated the Equifax hack and watched AI do everything without direct control
  • The AI model successfully carried out a major breach with zero human input
  • Shell commands weren’t needed; the AI acted as the planner and delegated everything else

Large language models (LLMs) have long been considered useful tools in areas like data analysis, content generation, and code assistance.

However, a new study from Carnegie Mellon University, conducted in collaboration with Anthropic, has raised difficult questions about their role in cybersecurity.

The study showed that under the right conditions, LLMs can plan and carry out complex cyberattacks without human guidance, suggesting a shift from mere assistance to full autonomy in digital intrusion.

From puzzles to enterprise environments

Earlier experiments with AI in cybersecurity were mostly limited to “capture-the-flag” scenarios, simplified challenges used for training.

The Carnegie Mellon team, led by PhD candidate Brian Singer, went further by giving LLMs structured guidance and integrating them into a hierarchy of agents.

With these settings, they were able to test the models in more realistic network setups.

In one case, they recreated the same conditions that led to the 2017 Equifax breach, including the vulnerabilities and layout documented in official reports.

The AI not only planned the attack but also deployed malware and extracted data, all without direct human commands....

....MUCH MORE