AI Task Force Catching Hackers? The Current State of 'Multi-Agent' Technology that Hacks and Patches Itself

AI Summary

'Multi-agent' technology, where multiple AIs form a team to find software security holes and demonstrate attacks, is advancing rapidly, but limitations still exist in complex web service environments.

Imagine this. It’s early morning, even before you head to work. A warning pops up overnight that a new hacking pathway has been discovered somewhere in a messenger app used daily by people worldwide. What would it have been like in the past? Security engineers would receive emergency calls, rush to the office, chug strong coffee, and spend hours or days scrutinizing millions of lines of code, building test environments, and erecting firewalls.

But it’s different now. While humans are fast asleep, multiple artificial intelligences (AIs) autonomously form a virtual ‘security task force.’ One AI pulls up the system blueprints to devise a strategy, another AI acts as a virtual hacker and attacks the code directly, and yet another AI analyzes the results in real time. By the time you get to work, this AI task force has left three things neatly on your desk: a vulnerability analysis report, a working attack demonstration video, and even the ‘patch code’ that cleanly resolves the issue.

Sounds like a sci-fi movie? It’s not. This is the astonishing reality of the ‘Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction’ that global AI researchers and cybersecurity experts are fiercely building today.

Large Language Models (LLMs)—AIs trained on massive text data to understand and generate language like humans, epitomized by ChatGPT—have now evolved far beyond their roles as mere writing and drawing assistants. They are deeply penetrating the most hidden and intimate parts of computer systems, fundamentally altering the landscape of cybersecurity. So, how exactly does this AI hacker task force operate, and where does it currently stand? And why should we pay close attention to this unfamiliar technology right now?

Why It Matters

All the digital services we rely on every day, from smartphone apps to bank websites and online shopping malls, consist of hundreds of thousands to millions of lines of code. Metaphorically, they are like densely packed bookshelves in an endlessly vast library. Because humans write this massive amount of code by hand, small mistakes inevitably creep in, and hackers exploit exactly these microscopic cracks to break in. In the security industry, these loopholes are called ‘vulnerabilities.’ Publicly known vulnerabilities are tracked using a sort of criminal identification tag called ‘CVE (Common Vulnerabilities and Exposures),’ which gives each one its own unique ID.
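To make the ‘identification tag’ idea concrete, here is a minimal Python sketch that checks whether a string looks like a CVE ID. It relies on the published ID syntax (the prefix CVE, a four-digit year, and a sequence number of at least four digits). The function name `parse_cve_id` is made up for this illustration.

```python
import re
from typing import Optional, Tuple

# CVE IDs look like CVE-<year>-<sequence>; the sequence has four or more digits.
CVE_ID = re.compile(r"CVE-(\d{4})-(\d{4,})")


def parse_cve_id(text: str) -> Optional[Tuple[int, int]]:
    """Return (year, sequence) for a well-formed CVE ID, otherwise None."""
    match = CVE_ID.fullmatch(text.strip().upper())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_cve_id("CVE-2021-44228"))  # a well-formed ID
print(parse_cve_id("CVE-2021-12"))     # sequence too short, so None
```

A check like this only confirms the shape of the tag. Whether the ID refers to a real, published entry still has to be looked up in the CVE list itself.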

The biggest problem is the overwhelming ‘speed’ and ‘volume.’ New vulnerabilities pour out like a waterfall around the world every day, and it takes an enormous amount of time and money for human experts to check them one by one and reproduce how dangerous they actually are in real system environments. Even once a vulnerability is found, painstakingly crafting an ‘Exploit’ (executable code that actually attacks the vulnerability) to prove that it can truly bring down a system is highly intellectual labor and a test of patience.

What if AI could take over this nerve-wracking process? Human security experts would be freed from the tedious verification work that used to keep them up all night. Instead, they could focus on the architect’s role of designing more creative and significant defense strategies. From a corporate perspective, a company can deploy an AI task force to find and fix a vulnerability before a hacker gets the chance to exploit it. This is not merely a slight technological advancement. In the battle of spears and shields in the digital world, it means gaining a powerful and tireless ‘automated shield.’ In fact, developing software frameworks for automated vulnerability discovery in modern web applications is already being treated as a key challenge in both academia and industry [Design and Implementation of a Multi-Agent AI System for...](https://www.hse.ru/en/edu/vkr/1157694160).

The Explainer: How Does a ‘Multi-Agent’ Operate?

The core of this innovative system lies in its unique architecture: the ‘Multi-agent’ (a system where multiple AIs take on respective roles and collaborate).

Let me explain it simply with an analogy. Imagine you have to undergo a highly complex surgery, like brain surgery. Even the most brilliant genius doctor in the world cannot simultaneously administer anesthesia, wield the scalpel, and monitor blood pressure levels. For a perfect and safe surgery, the operating surgeon overseeing the entire situation, the anesthesiologist controlling the patient’s vital signs, and the scrub nurse handing over surgical tools at the right place and time must form a flawless team.

The world of AI is exactly the same. What happens if you yell at a single, reputedly smart, massive AI, “Thoroughly find all the security holes in this software and generate attack code right now!”? The volume of information to process at once becomes too overwhelming, causing it to hallucinate—making up facts that don’t exist—or get lost in a swamp of pouring data. Thus, researchers wisely introduced multi-agent systems. By orchestrating several ‘specialized agents’ dedicated exclusively to specific sub-tasks, they have enabled the resolution of complex problems that go far beyond the limits of a single agent [FuzzingBrain V2: A Multi-Agent LLM System for Automated...](https://arxiv.org/pdf/2605.21779).

Looking at actual research cases of vulnerability hunting, we can see how perfectly this operating room teamwork analogy fits. DARKNAVY, a security group that has been hunting vulnerabilities directly in the field for years, proposed a multi-agent architecture called ‘Argusee.’ Surprisingly, this system was designed to mirror the sophisticated division of labor and collaboration mechanisms within real human security teams [Argusee: A Multi-Agent Collaborative Architecture for Automated Vulnerability Discovery | DARKNAVY](https://www.darknavy.org/blog/argusee_a_multi_agent_collaborative_architecture_for_automated_vulnerability_discovery/). In other words, rather than creating a single genius hacker robot that plays a one-man band, they birthed a highly trained ‘cyber task force’ where each member has their own specialty.

A representative research case that vividly demonstrates this perfect division of roles in an AI task force is the ‘Co-RedTeam’ system. This system consists primarily of four team members who constantly converse and interact within a safely isolated execution environment [Co-RedTeam: Orchestrated Security Discovery and Exploitation with LLM Agents](https://arxiv.org/pdf/2602.02164).

Planning: Scans the overall structure of the system and devises the big picture and detailed strategies on “where and how to strike.”
Execution: Takes the strategy from the planner, writes the actual attack code (commands), and hits the execute button. In the operating room analogy, this is the surgeon wielding the scalpel.
Validation: Coldly checks whether the executed attack actually worked or was blocked by the system’s solid firewall, based on objective data.
Evaluation: Reviews this entire process and provides sharp feedback, asking, “Why did the attack just fail?” and “What needs to be improved for the next attack?”
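
The four roles above can be pictured as a simple loop. The Python sketch below is only an illustration of that division of labor, not Co-RedTeam’s actual implementation: there is no LLM involved, and every name in it (`toy_target`, `plan`, `execute`, `validate`, `evaluate`, `run_task_force`) is hypothetical. The ‘target’ is a toy function standing in for the isolated execution environment.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Attempt:
    payload: str
    succeeded: bool
    feedback: str


def toy_target(path: str) -> str:
    """Hypothetical sandboxed service: leaks a marker on path traversal."""
    return "SECRET_MARKER" if "../" in path else "200 OK"


def plan(history: List[Attempt], candidates: List[str]) -> Optional[str]:
    """Planning: choose the next candidate that has not been tried yet."""
    tried = {attempt.payload for attempt in history}
    return next((c for c in candidates if c not in tried), None)


def execute(target: Callable[[str], str], payload: str) -> str:
    """Execution: send the payload to the isolated target."""
    return target(payload)


def validate(output: str) -> bool:
    """Validation: decide success from objective evidence in the output."""
    return "SECRET_MARKER" in output


def evaluate(payload: str, succeeded: bool) -> str:
    """Evaluation: turn the outcome into feedback for the next round."""
    if succeeded:
        return f"{payload!r} worked; keep this strategy"
    return f"{payload!r} was blocked; try a different input shape"


def run_task_force(
    target: Callable[[str], str], candidates: List[str], max_rounds: int = 5
) -> List[Attempt]:
    """Run plan -> execute -> validate -> evaluate until success or exhaustion."""
    history: List[Attempt] = []
    for _ in range(max_rounds):
        payload = plan(history, candidates)
        if payload is None:
            break  # nothing left to try
        succeeded = validate(execute(target, payload))
        history.append(Attempt(payload, succeeded, evaluate(payload, succeeded)))
        if succeeded:
            break
    return history


for attempt in run_task_force(toy_target, ["index.html", "../secret.txt"]):
    print(attempt)
```

In a real system each function would be a separate LLM agent with its own prompt and tools. The point of the sketch is only that each role sees a narrow slice of the problem and hands a small, well-defined result to the next one.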

An even more chilling capability of this task force is its use of ‘layered long-term memory.’ It does not commit the folly of meekly giving up after a single failed attempt. Its memory stores previously discovered vulnerability patterns, refined attack strategies, and specific technical actions, just like a veteran detective’s case notebook. By remembering past failures and successes and reusing them in later missions, the system keeps improving with experience [Co-RedTeam: Orchestrated Security Discovery and Exploitation with LLM Agents](https://arxiv.org/pdf/2602.02164).
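
As a rough picture of what a ‘case notebook’ with layers might look like, here is a small Python sketch. It assumes three layers matching the three kinds of knowledge mentioned above and uses plain keyword tags for lookup. The class name `LayeredMemory`, its methods, and the tag-overlap ranking are all assumptions made for illustration; the paper’s actual storage and retrieval design may differ.

```python
from typing import Dict, List, Set, Tuple

# Assumed layer names, one per kind of knowledge described in the text.
LAYERS = ("patterns", "strategies", "actions")


class LayeredMemory:
    """Toy long-term memory: notes are filed per layer and looked up by tags."""

    def __init__(self) -> None:
        self._store: Dict[str, List[Tuple[Set[str], str]]] = {
            layer: [] for layer in LAYERS
        }

    def remember(self, layer: str, tags: Set[str], note: str) -> None:
        """File a note under one layer with a set of keyword tags."""
        if layer not in self._store:
            raise ValueError(f"unknown layer: {layer}")
        self._store[layer].append((set(tags), note))

    def recall(self, layer: str, tags: Set[str]) -> List[str]:
        """Return notes sharing at least one tag, most overlapping first."""
        scored = [
            (len(entry_tags & tags), note)
            for entry_tags, note in self._store[layer]
        ]
        matches = [item for item in scored if item[0] > 0]
        matches.sort(key=lambda item: item[0], reverse=True)
        return [note for _, note in matches]


memory = LayeredMemory()
memory.remember("patterns", {"path", "traversal"}, "unchecked '../' in file names")
memory.remember("strategies", {"traversal"}, "try encoded separators if blocked")
print(memory.recall("patterns", {"traversal", "upload"}))
```

On a new mission, a planner could call `recall` with tags describing the target and start from what worked before instead of from a blank page.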

Another framework study, ‘CVE-Genie,’ takes this one step further. Its authors defined five core attributes called ‘EAGER’ that an ideal vulnerability reproduction system should possess. This goes well beyond simply writing attack code. The goal is an ‘end-to-end automated’ pipeline that works across various programming languages and projects: the AI autonomously rebuilds the vulnerable environment, generates Proof of Concept (PoC) code, and establishes validators to check the result [From CVE Entries to Verifiable Exploits: An Automated Multi-Agent Framework for Reproducing CVEs](https://arxiv.org/html/2509.01835v1).

Where We Stand: The Birth of the Perfect Hacker?

So, could this formidable AI task force render all hackers and security experts worldwide unemployed as early as tomorrow? To give you a very clear conclusion right upfront: “We still have a long way to go.”

Recently, researchers summoned the most capable AI agents available today (OpenHands, SWE-agent, CAI, etc.) into the ring and subjected them to a brutal test. They built a benchmark of 80 real-world web vulnerabilities (CVEs) spanning 7 vulnerability types and 6 modern web technologies [[2510.14700] LLM Agents for Automated Web Vulnerability Reproduction: Are We There Yet?](https://arxiv.org/abs/2510.14700). The test was a cold, hard check of whether these cutting-edge AIs, raised like hothouse plants inside labs, could actually show their skills in the ‘complex software environments of the real world,’ swept by wind and rain [LLM Agents and Web Vulnerability Reproduction | ShortSpan.ai](https://shortspan.ai/llm-agents-struggle-to-reproduce-web-vulnerabilities.html).

The test results laid bare the distinct limitations of artificial intelligence. On the bright side, the AI agents showed a fairly reasonable success rate when reproducing ‘simple vulnerabilities’ contained within specific libraries (small collections of code that act as software components).

As an analogy, they did very well at the single mission of ‘picking one old, broken padlock hanging on a barn in a remote neighborhood.’ The target is in plain sight and there is only one hole to get through, so the scope of the problem stays very narrow.

However, the real headache lies in the fact that most modern web services are by no means as simple as a neighborhood barn. The smartphone apps or shopping malls we casually tap on are like giant high-tech buildings where countless components—invisible backend servers, massive databases, and intricately woven login authentication systems—interlock and spin like cogwheels behind the flashy screens we see.

According to the research results, these otherwise capable LLM agents consistently failed when faced with ‘complex service-based vulnerabilities,’ which require multi-component environments where several elements operate at the same time [[2510.14700] LLM Agents for Automated Web Vulnerability Reproduction: Are We There Yet?](https://arxiv.org/abs/2510.14700).

Let’s compare this chaotic situation to a movie. Think of the movie Ocean’s Eleven. For the protagonist’s crew to rob an ironclad casino vault, they need a highly complex operation: one person cuts the power from the basement, while another simultaneously distracts the guards, and yet another inputs a fake fingerprint at the precise timing.

But current AI panics and flounders in front of such a complex operation, not knowing what to do first. It struggles to maintain a long ‘context’ across tasks, and loses its way when it has to pull up operating records (logs) from multiple scattered servers at once and work out how they relate. It still falls far short of the deep intuition and extensive experience of human security experts, who can look across an entire massive system and find the connecting links rather than merely fixing a single component.

What’s Next

Even if the current AI task force exhibits novice behavior, holding the map upside down and wandering in front of a giant casino building, considering the dazzling pace of AI technological advancement, there is a high probability that these limitations will be overcome in the near future. Then, what exactly is the next-stage goal for multi-agent systems once they broaden their horizons and become even smarter?

Security experts are offering clues to this compelling answer through next-generation multi-agent LLM system research like ‘FuzzingBrain V2’. Researchers strongly expect that when cutting-edge ‘long-context’ LLM technology—capable of reading and remembering tens of thousands of books at once—is introduced, agents will maintain unshakeable logic without losing focus even during long and tedious analysis sessions lasting for days [FuzzingBrain V2: A Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction](https://arxiv.org/html/2605.21779v1).

However, there is a separate ultimate developmental direction that makes the hearts of the global security industry beat the fastest. That is ‘Automatic patch generation.’

If AI agents so far have stopped at the diagnostic role of ‘finding and stealthily poking at’ security holes like spies, the leap to the next stage is to fill those holes themselves, that is, to write patch code. By analogy, it’s like a great security guard not stopping at catching a thief, but immediately calling a locksmith and a carpenter on the spot to install stronger new doors and locks.

If AI successfully generates Proof of Concept (PoC) code, it has, in a sense, already grasped the root cause of the vulnerability. Having identified the cause, the AI would then take charge of the rest of the process: autonomously generating a fix to plug the hole, and verifying that the fix works without breaking other parts of the system. In other words, researchers expect that AI will eventually complete the vulnerability lifecycle on its own, from the moment a vulnerability is born to when it is buried [FuzzingBrain V2: A Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction](https://arxiv.org/html/2605.21779v1) [[2605.21779] FuzzingBrain V2: A Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction](https://arxiv.org/abs/2605.21779).
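
What ‘verifying a patch’ means can be shown with a toy example. The Python sketch below is an illustration under stated assumptions, not how FuzzingBrain V2 or any other cited system works: the vulnerable function, the candidate patch, the PoC, and the regression check are all invented here, and the directory `/srv/app/public` is a made-up path. The acceptance rule is the simple one described above: the PoC must work before the patch, must stop working after it, and normal behavior must stay intact.

```python
import posixpath
from typing import Callable, Optional

BASE = "/srv/app/public"  # hypothetical public directory

Resolver = Callable[[str], Optional[str]]


def resolve_vulnerable(name: str) -> Optional[str]:
    """Toy vulnerable resolver: joins paths without checking the result."""
    return posixpath.normpath(posixpath.join(BASE, name))


def resolve_patched(name: str) -> Optional[str]:
    """Candidate patch: refuse any path that ends up outside BASE."""
    path = posixpath.normpath(posixpath.join(BASE, name))
    if path != BASE and not path.startswith(BASE + "/"):
        return None
    return path


def poc_escapes_base(resolve: Resolver) -> bool:
    """PoC: True if a traversal payload resolves to a path outside BASE."""
    path = resolve("../../etc/passwd")
    return path is not None and not path.startswith(BASE + "/")


def regression_ok(resolve: Resolver) -> bool:
    """Existing behavior the patch must not break."""
    return resolve("css/site.css") == BASE + "/css/site.css"


def patch_is_acceptable(before: Resolver, after: Resolver) -> bool:
    """Accept only if the PoC worked before, fails after, and nothing regressed."""
    return (
        poc_escapes_base(before)
        and not poc_escapes_base(after)
        and regression_ok(after)
    )


print(patch_is_acceptable(resolve_vulnerable, resolve_patched))
```

The same PoC that proved the danger doubles as the test that proves the fix, which is why reproduction and patching fit naturally into one pipeline.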

In the near future, when this technology settles into our daily lives, the landscape of cybersecurity will be completely transformed. It will no longer be an analog, sweaty game of hide-and-seek between a ‘human hacker’ behind a computer and a ‘human defender’ holding a shield, as in the past. It will turn into a gigantic automated chessboard where an ‘AI defense squad’ holding a steel shield and an ‘AI attacker’ holding a sharp spear fight virtual battles over and over at machine speed, probing each other’s weak points and continually improving.

AI’s Take

Cybersecurity in the past was a highly analog, exhausting game of hide-and-seek between a few genius hackers in hoodies in dark rooms and corporate defenders trying to stop their stealthy intrusions. However, the dazzling emergence of multi-agent systems is transforming this tedious hide-and-seek into a massive ‘automated defense factory’ operating with a roar 24/7 without a single second of rest.

Of course, the AI of today might just be a rookie task force lost and wandering in front of the intricately tangled architecture of modern colossal software buildings. But their terrifying potential to autonomously find security gaps, devise sophisticated hacking operations, and ultimately weave the perfect vaccine code to stitch up those deep wounds will undoubtedly shake the topography of IT technology to its roots in the near future.

Such astonishing changes also demand a major shift in how human developers see their own role. If developers of the past were simply ‘coders’ laying bricks (code) by hand one by one, we in the future must be reborn as ‘orchestra conductors’ who coordinate complex AI task force teams and issue clear directives. In the not-so-distant tomorrow, the safe survival of companies and society might depend not on what flashy new services are ‘developed,’ but on how well we organize and rigorously train the ‘AI task force that protects’ the systems we already have.

References

[FuzzingBrain V2: A Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction](https://arxiv.org/html/2605.21779v1)
[[2605.21779] FuzzingBrain V2: A Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction](https://arxiv.org/abs/2605.21779)
[Co-RedTeam: Orchestrated Security Discovery and Exploitation with LLM Agents](https://arxiv.org/pdf/2602.02164)
[[2510.14700] LLM Agents for Automated Web Vulnerability Reproduction: Are We There Yet?](https://arxiv.org/abs/2510.14700)
[From CVE Entries to Verifiable Exploits: An Automated Multi-Agent Framework for Reproducing CVEs](https://arxiv.org/html/2509.01835v1)
[Argusee: A Multi-Agent Collaborative Architecture for Automated Vulnerability Discovery | DARKNAVY](https://www.darknavy.org/blog/argusee_a_multi_agent_collaborative_architecture_for_automated_vulnerability_discovery/)
[FuzzingBrain V2: A Multi-Agent LLM System for Automated...](https://arxiv.org/pdf/2605.21779)
[Design and Implementation of a Multi-Agent AI System for...](https://www.hse.ru/en/edu/vkr/1157694160)
[LLM Agents and Web Vulnerability Reproduction | ShortSpan.ai](https://shortspan.ai/llm-agents-struggle-to-reproduce-web-vulnerabilities.html)


Test Your Understanding

Q1. Which of the following is the most appropriate analogy for a 'multi-agent' system?

An all-around tutor who teaches all subjects alone
An operating room team where surgeons, anesthesiologists, and nurses each take on roles and collaborate
A simple calculator that repeatedly calculates only input numbers

A multi-agent system is not a single AI handling everything, but a system where multiple AIs with different specialties divide roles and collaborate.

Q2. What kind of vulnerabilities did AI agents struggle to reproduce in recent studies?

Simple library-based vulnerabilities
Old vulnerabilities perfectly resolved in the past
Service-based vulnerabilities intertwined with multiple complex components

AI agents are reasonably good at reproducing simple library-based vulnerabilities but tend to fail consistently in complex multi-component environments where various web technologies and systems are intertwined.

Q3. What do researchers point to as the ultimate future direction for automated vulnerability discovery systems?

Going beyond simply discovering vulnerabilities to automatically generating patches (solutions)
AI dominating the world by completely excluding humans
Replacing all security systems with physical locks

Researchers see the next step as AI completing the entire vulnerability lifecycle by identifying the root cause of a vulnerability, automatically generating a patch to fix it, and verifying it.