In a major advance in the fields of cybersecurity and artificial intelligence, researchers from Carnegie Mellon University, in collaboration with Anthropic, have demonstrated that large language models (LLMs) can autonomously plan and execute sophisticated cyberattacks on enterprise-grade network environments without human intervention.
The study, led by Ph.D. candidate Brian Singer from Carnegie Mellon's Department of Electrical and Computer Engineering, reveals that LLMs, when structured with high-level planning capabilities and supported by specialized agent frameworks, can simulate network intrusions that closely mirror real-world breaches. The study's most striking finding: an LLM successfully replicated the infamous 2017 Equifax data breach in a controlled research environment, autonomously exploiting vulnerabilities, installing malware, and exfiltrating data.
“Our research shows that with the right abstractions and guidance, LLMs can go far beyond basic tasks,” said Singer. “They can coordinate and execute attack strategies that reflect real-world complexity.”
The team developed a hierarchical architecture where the LLM acts as a strategist, planning the attack and issuing high-level instructions, while a mix of LLM and non-LLM agents carry out low-level tasks like scanning networks or deploying exploits. This approach proved far more effective than earlier methods, which relied solely on LLMs executing shell commands.
This work builds on Singer’s prior research into making autonomous attacker and defender tools more accessible and programmable for human developers. Ironically, the same abstractions that simplified development for humans made it easier for LLMs to autonomously perform similar tasks.
While the findings are groundbreaking, Singer emphasized that the research remains a prototype.
“This isn’t something that’s going to take down the internet tomorrow,” he said. “The scenarios are constrained and controlled—but it’s a powerful step forward.”
The implications are twofold: the research highlights serious long-term safety concerns about the potential misuse of increasingly capable LLMs, but it also opens up transformative possibilities for defensive cybersecurity.
“Today, only large organizations can afford red team exercises to proactively test their defenses,” Singer explained. “This research points toward a future where AI systems continuously test networks for vulnerabilities, making these protections accessible to small organizations too.”
The project was conducted in collaboration with Anthropic, which provided model credits and technical consultation. The team included CMU students and faculty affiliated with CyLab, the university’s security and privacy institute. An early version of the research was presented at an OpenAI-hosted security workshop in May.
The resulting paper, “On the Feasibility of Using LLMs to Autonomously Execute Multi-host Network Attacks,” has been cited in multiple industry reports and is already informing safety documentation for cutting-edge AI systems. Lujo Bauer and Vyas Sekar, co-directors of CMU’s Future Enterprise Security Initiative, served as faculty advisors for the project.
Looking ahead, the team is now studying how similar architectures might enable autonomous AI defenses, exploring scenarios where LLM-based agents detect and respond to attacks in real time.
“We're entering an era of AI versus AI in cybersecurity,” Singer said. “And we need to understand both sides to stay ahead.”
About the College of Engineering: The College of Engineering at Carnegie Mellon University is a top-ranked engineering college known for its intentional focus on cross-disciplinary collaboration in research and for working on problems of both scientific and practical importance. A “maker” culture is ingrained in all that the College does, leading to novel approaches and transformative results that drive the intellectual and economic vitality of its community, nation, and world.
About CyLab: CyLab is the university-wide security and privacy institute at Carnegie Mellon University. Its mission is to catalyze, support, promote, and strengthen collaborative security and privacy research and education across departments, disciplines, and geographic boundaries to achieve significant impact on research, education, public policy, and practice.
View source version on businesswire.com: https://www.businesswire.com/news/home/20250724351815/en/
© Business Wire, Inc.