Author: Denis Avetisyan
A new study examines how collaborative cybersecurity teams, pairing human experts with artificial intelligence, perform in competitive hacking challenges.

Research reveals the benefits and limitations of AI assistance in Capture The Flag (CTF) competitions, emphasizing the critical role of human guidance in vulnerability discovery and exploit generation.
While artificial intelligence increasingly demonstrates proficiency in technical security tasks, its effective integration with human expertise remains a critical challenge. This paper, ‘Understanding Human-AI Collaboration in Cybersecurity Competitions’, investigates this interplay through an empirical study of human teams collaborating with AI agents during live Capture-The-Flag (CTF) competitions. Our findings reveal that successful problem-solving is often constrained not by the AI's reasoning capabilities, but by the human player's ability to effectively prompt and guide the AI, a limitation overcome by fully autonomous agents, which outperformed most human-AI teams. How can we design future CTF challenges and human-in-the-loop systems to maximize the synergistic potential of human intuition and AI automation for enhanced cybersecurity?
The Evolving Landscape of Vulnerability Research
Historically, identifying software vulnerabilities has been a painstaking process, largely dependent on skilled security analysts meticulously dissecting code by hand or employing limited automated tools. This manual approach is inherently slow, requiring significant time investment to uncover even a single weakness, and proves exceptionally expensive due to the need for highly trained personnel. However, the escalating complexity of modern software – with millions of lines of code, intricate dependencies, and frequent updates – now far outstrips the capacity of manual analysis to keep pace. Consequently, vulnerabilities are often discovered after they've been exploited in the wild, leading to costly breaches and significant damage – a situation that underscores the urgent need for more efficient and scalable vulnerability discovery methods.
The escalating frequency and complexity of contemporary cyberattacks necessitate a fundamental shift in how security vulnerabilities are identified and mitigated. Traditional, manual methods of vulnerability research are increasingly insufficient, struggling to keep pace with the sheer volume of code and the ingenuity of malicious actors. Modern attacks, often leveraging zero-day exploits and advanced persistent threats, demand proactive, scalable solutions capable of automatically detecting weaknesses before they can be exploited. Consequently, there’s a growing impetus to develop automated tools and techniques that can analyze vast codebases, identify potential flaws, and prioritize remediation efforts – moving beyond reactive patching to a more preventative security posture. This transition is critical for organizations seeking to maintain a robust defense against increasingly sophisticated threats and minimize the potential for damaging breaches.
The application of Large Language Models to vulnerability discovery represents a significant, though nuanced, advancement in cybersecurity. These models, trained on vast datasets of code, can identify patterns and anomalies suggestive of security flaws with increasing accuracy – potentially automating tasks previously requiring extensive manual effort. However, simply deploying an LLM isn't sufficient; careful orchestration is crucial. Effective implementation demands precise prompting strategies to guide the model's analysis, robust filtering mechanisms to minimize false positives, and integration with existing security workflows. Furthermore, LLMs aren't infallible and can be susceptible to adversarial attacks or exhibit biases present in their training data, necessitating continuous monitoring and human oversight to validate findings and prevent exploitation. The true power lies not in replacing security experts, but in augmenting their capabilities with AI-driven tools that accelerate the identification and remediation of vulnerabilities.
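The prompting-plus-filtering pattern described above can be sketched minimally. This is an illustrative example, not the paper's implementation: the prompt wording, the response format, and the helper names are all assumptions, and the actual LLM call is omitted; only the prompt construction and a crude format-based false-positive filter are shown.

```python
import re

# Hypothetical triage prompt. A real deployment would tune this wording
# and send it to an LLM API; here we only build and post-process text.
PROMPT_TEMPLATE = (
    "You are a security analyst. Review the following function and "
    "report each vulnerability on its own line as 'FLAW: <type>: <line>'. "
    "If none, reply 'CLEAN'.\n\n{code}"
)

# Accept only responses that follow the requested format; free-form
# chatter from the model is dropped, which filters many false positives.
FINDING_RE = re.compile(r"^FLAW: (?P<type>[\w-]+): (?P<line>\d+)$")

def build_prompt(code: str) -> str:
    """Embed the target code in the triage prompt."""
    return PROMPT_TEMPLATE.format(code=code)

def filter_findings(llm_response: str) -> list[dict]:
    """Keep only well-formed findings from the model's raw output."""
    findings = []
    for line in llm_response.splitlines():
        m = FINDING_RE.match(line.strip())
        if m:
            findings.append({"type": m["type"], "line": int(m["line"])})
    return findings
```

For example, `filter_findings("FLAW: sql-injection: 42\nI also noticed...")` keeps only the structured finding and discards the trailing commentary.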
The escalating complexity of modern software and the increasing sophistication of cyber threats are driving a necessary evolution in vulnerability research. While human expertise remains crucial for nuanced analysis and contextual understanding, it is becoming increasingly clear that relying solely on manual methods is unsustainable. Consequently, the field is witnessing a growing integration of artificial intelligence and machine learning tools designed to augment, not replace, human capabilities. These AI-driven techniques automate repetitive tasks like code scanning and pattern recognition, allowing researchers to focus on more complex issues and accelerate the identification of potential weaknesses. This synergistic approach, combining the analytical power of AI with the critical thinking of human experts, represents a pivotal shift towards a more proactive and scalable security posture, ultimately enhancing the resilience of software systems against emerging threats.
Orchestrating Intelligence: AI Agent Frameworks
AI agent frameworks, including ENIGMA, CRAKEN, and Cybench, establish the foundational infrastructure required to integrate Large Language Model (LLM) reasoning capabilities with external tools and dynamic environments. These frameworks facilitate the connection between an LLM's analytical processing and actionable outputs, allowing it to utilize software tools – such as APIs, scripting languages, or specialized software – to interact with and manipulate its surrounding environment. This interaction isn't limited to digital spaces; frameworks can also enable LLMs to control physical systems via connected devices or simulations, effectively extending the LLM's operational scope beyond purely textual data processing and enabling complex, autonomous task execution.
AI agent frameworks facilitate the execution of complex tasks by allowing Large Language Models (LLMs) to interact directly with external systems and environments. Unlike traditional LLMs limited to text input and output, these frameworks enable LLMs to utilize tools – such as network scanners, web browsers, or code interpreters – to gather information, perform actions, and observe the resulting changes within a simulated or real-world context. This capability allows for the creation of realistic attack scenarios, where the LLM, functioning as an agent, can autonomously probe systems, exploit vulnerabilities, and attempt to achieve specified objectives, moving beyond passive text analysis to active, goal-oriented behavior.
Prompt engineering is a critical element in AI agent frameworks as it directly influences the Large Language Model's (LLM) actions and overall performance. Effective prompts define the agent's goals, constraints, and the desired format for outputs, thereby shaping its interaction with tools and the environment. This process involves careful crafting of instructions, including the use of few-shot examples or chain-of-thought reasoning techniques, to guide the LLM towards successful task completion. The quality of the prompt significantly impacts the agent's ability to correctly interpret requests, select appropriate tools, and generate valid, actionable responses; poorly designed prompts can lead to irrelevant actions, errors, or a failure to achieve the intended objective. Consequently, iterative prompt refinement and optimization are essential components of developing robust and reliable AI agents.
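A structured agent prompt of the kind described, combining a goal, constraints, and few-shot examples, can be assembled mechanically. The section layout and field names below are illustrative assumptions, not taken from ENIGMA, CRAKEN, or Cybench:

```python
# Minimal sketch of structured prompt assembly, assuming the framework
# accepts a single system-style string. Sections: goal, constraints,
# few-shot examples, then an output-format instruction.

def assemble_prompt(goal: str, constraints: list[str],
                    few_shot: list[tuple[str, str]]) -> str:
    """Combine goal, constraints, and worked examples into one prompt."""
    parts = [f"GOAL: {goal}", "CONSTRAINTS:"]
    parts += [f"- {c}" for c in constraints]
    parts.append("EXAMPLES:")
    for task, answer in few_shot:
        parts.append(f"Task: {task}\nAnswer: {answer}")
    parts.append("Respond with a single tool call in the form "
                 "tool_name(arguments).")
    return "\n".join(parts)
```

The final format instruction matters as much as the examples: constraining the response shape is what lets the surrounding framework parse the model's output into an executable action.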
AI agent framework performance is a function of both the capabilities of the Large Language Model (LLM) employed and the architecture of the agent's interaction loop. While a more powerful LLM generally yields improved results due to enhanced reasoning and language understanding, this potential is constrained by the agent's design. The interaction loop, encompassing observation, planning, and action execution, dictates how effectively the LLM's reasoning is translated into tangible results within a given environment. Specifically, loop efficiency, measured by the speed and accuracy of observation processing and the precision of action selection, directly impacts overall performance. A poorly designed loop can introduce bottlenecks, misinterpret environmental feedback, or lead to suboptimal action sequences, even when utilizing a state-of-the-art LLM. Therefore, optimization of both the LLM and the interaction loop is critical for maximizing the efficacy of AI agent frameworks.
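The observe-plan-act loop can be reduced to a small skeleton. In this sketch the planner is a deterministic stub standing in for an LLM call, and the environment is an abstract dictionary of callbacks; both are assumptions made for illustration, not any specific framework's API:

```python
# Toy observe-plan-act loop. In a real framework the plan step would be
# a model invocation and actions would drive real tools; here both are
# injected as plain callables so the loop structure is visible.

def run_agent(env: dict, plan, max_steps: int = 10) -> list[str]:
    """Drive the loop until the goal is observed or steps run out."""
    trace = []
    for _ in range(max_steps):
        observation = env["observe"]()      # 1. observe the environment
        if env["goal_reached"](observation):
            break                           # stop once the goal is seen
        action = plan(observation, trace)   # 2. plan (LLM in practice)
        result = env["act"](action)         # 3. act via a tool
        trace.append(f"{action} -> {result}")
    return trace
```

The `max_steps` cap and the returned trace reflect the efficiency concern above: a loop that misreads observations or picks poor actions burns its step budget without progress, regardless of how capable the underlying model is.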

Validating Intelligence: Human-AI Synergy in Practice
Capture The Flag (CTF) competitions serve as a standardized and challenging evaluation platform for assessing the capabilities of AI agents in cybersecurity. These competitions present participants with a variety of deliberately vulnerable systems and require them to identify and exploit vulnerabilities to retrieve hidden "flags." The multi-faceted nature of CTF challenges, encompassing areas such as web exploitation, reverse engineering, cryptography, and forensics, provides a comprehensive testbed, simulating real-world security scenarios. Utilizing publicly available CTF datasets and scoring systems allows for objective comparison of AI agent performance against established human expert benchmarks, facilitating measurable progress in autonomous security capabilities and providing a consistent metric for research and development.
Successful performance of AI agents in complex security scenarios is heavily reliant on robust Tool Interaction capabilities. These capabilities encompass not only the ability to correctly invoke tools – such as debuggers, disassemblers, and network analyzers – but also to effectively chain tool outputs as inputs for subsequent actions. Research indicates that agents lacking proficient tool interaction struggle to progress beyond initial reconnaissance phases, failing to exploit vulnerabilities identified through analysis. Specifically, effective agents demonstrate an ability to parse tool outputs, extract relevant information, and dynamically adjust tool selection based on the evolving security landscape, enabling them to autonomously navigate complex challenges that require iterative analysis and exploitation.
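Chaining one tool's output into the next tool's input, as described above, can be illustrated with a small parser. The report format is an nmap-style text layout and the follow-up tool invocations (`dirb`, `nc`) are illustrative assumptions; a real agent would select tools dynamically:

```python
import re

# Extract open ports from an nmap-style report line such as
# "80/tcp open http", then choose a follow-up tool per service.
PORT_RE = re.compile(r"^(\d+)/tcp\s+open\s+(\S+)", re.MULTILINE)

def parse_open_ports(scan_output: str) -> list[tuple[int, str]]:
    """Extract (port, service) pairs from a scan report."""
    return [(int(port), svc) for port, svc in PORT_RE.findall(scan_output)]

def next_actions(scan_output: str, host: str) -> list[str]:
    """Choose follow-up tool invocations based on discovered services."""
    actions = []
    for port, service in parse_open_ports(scan_output):
        if service == "http":
            # Web service found: enumerate directories next.
            actions.append(f"dirb http://{host}:{port}/")
        else:
            # Unknown service: probe it interactively.
            actions.append(f"nc {host} {port}")
    return actions
```

The point of the sketch is the handoff: the agent must parse unstructured tool output into structured facts before it can condition the next action on them, which is exactly where agents with weak tool interaction stall.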
Performance in cybersecurity tasks, for both human experts and artificial intelligence agents, is directly correlated with the depth and relevance of domain-specific knowledge. Specialized training and contextual understanding enable more efficient problem-solving and accurate identification of vulnerabilities. AI agents lacking sufficient domain knowledge exhibit reduced efficacy in complex scenarios, mirroring the limitations observed in human analysts unfamiliar with specific systems or attack vectors. Conversely, both groups demonstrate significantly improved results when provided with targeted datasets and pre-trained models focused on relevant security principles, exploit techniques, and system architectures. This underscores the necessity of incorporating robust knowledge representation and continuous learning mechanisms into both human training programs and AI agent development.
Recent research indicates that advanced autonomous agents are demonstrating performance levels exceeding those of most human teams in Capture The Flag (CTF) competitions. Specifically, these agents achieved a score of 4900, positioning them second among the top ten human teams. Furthermore, the agents completed the competition in approximately 20% of the time required by the human teams, indicating a significant efficiency advantage. These results suggest a potential for AI to substantially augment or even surpass human capabilities in certain cybersecurity challenge scenarios.
![Sonnet-4.5 (S), Opus-4.1 (O), and Haiku-3.5 (H) agents achieved competitive scores, rivaling those of the top ten human teams.](https://arxiv.org/html/2602.20446v1/figs/score_tracking.png)
The Future of Automated Security Analysis
Recent evaluations of autonomous agents within Capture The Flag (CTF) competitions are providing compelling evidence for the viability of AI-driven security analysis. These aren't simple, controlled tests; CTFs simulate realistic cyberattack scenarios, demanding agents independently identify and exploit vulnerabilities, often against actively defending systems. Successful performance in such complex environments demonstrates an agent's capacity to move beyond identifying known signatures and towards genuine problem-solving, a crucial step for proactive threat detection. The ability to consistently achieve high scores, or even compete with human security experts, validates the potential for these AI systems to augment, and eventually automate, significant portions of the cybersecurity workflow, shifting the focus from reactive incident response to preventative security measures.
Successful deployment of artificial intelligence in cybersecurity isn't simply about creating powerful agents; it hinges on their smooth integration into pre-existing security operations. Current security information and event management (SIEM) systems, vulnerability scanners, and incident response platforms represent substantial investments for organizations, and a disruptive overhaul isn't practical or desirable. Therefore, AI agents must function as force multipliers, augmenting human analysts and automating repetitive tasks within these established workflows. This necessitates the development of APIs and standardized interfaces that allow AI to ingest data from existing tools, present findings in a readily understandable format, and seamlessly execute actions – such as triaging alerts or initiating automated remediation – without requiring significant changes to established processes. Only through such interoperability can organizations realize the full potential of AI-driven security, moving beyond isolated experiments to achieve widespread, practical adoption and a genuinely improved security posture.
The efficacy of automated vulnerability discovery is increasingly tied to advancements in how artificial intelligence agents are instructed and equipped. Current research emphasizes that simply providing an AI with access to security tools isn't enough; the quality of the prompts – the specific instructions given – dramatically impacts performance. Sophisticated prompt engineering techniques, including few-shot learning and chain-of-thought reasoning, are enabling agents to not only utilize tools like fuzzers and static analyzers, but to strategically combine them and interpret results with greater accuracy. Moreover, investigations into how AI agents interact with multiple tools simultaneously – orchestrating complex workflows and adapting based on intermediate findings – are revealing pathways to automate tasks previously requiring significant human expertise. These ongoing studies suggest that refining the communication interface between AI and security tools holds the key to unlocking a new generation of proactive, self-improving vulnerability discovery systems.
The convergence of artificial intelligence and human expertise promises a paradigm shift in cybersecurity, moving beyond reactive measures toward proactive threat anticipation. This collaborative future envisions AI agents continuously monitoring systems, identifying anomalies, and prioritizing potential vulnerabilities, while human analysts focus on complex investigations, strategic decision-making, and refining AI models. Such a synergy isn't about replacing security professionals, but rather augmenting their capabilities, allowing them to address a greater volume of threats with increased speed and precision. By automating repetitive tasks and providing intelligent insights, this human-AI partnership aims to significantly bolster overall cybersecurity posture, reducing the window of opportunity for malicious actors and enabling a more resilient defense against increasingly sophisticated cyberattacks. The resulting security operations centers will be empowered to not only respond to incidents, but to actively hunt for threats and preemptively mitigate risks.
The study illuminates a crucial dynamic: effective human-AI collaboration isn't about replacing human intellect, but augmenting it. This echoes Alan Turing's sentiment: "It is possible to build a machine that can carry out any operation which could be done by a human being." The research demonstrates that while AI agents excel at tasks like vulnerability discovery, they require human direction to prioritize efforts and validate findings. Clarity is the minimum viable kindness; the paper reveals that the true power lies in a streamlined interface allowing humans to efficiently guide AI's computational strengths, ultimately enhancing overall performance in complex cybersecurity challenges. The limitations observed underscore that AI, despite its capabilities, remains a tool, its effectiveness contingent upon astute human oversight.
Where Do We Go From Here?
The exercise reveals, predictably, that a system requiring human oversight is, at best, halfway to resolution. The value isn't in what the Large Language Model adds to vulnerability discovery, but in what human direction removes from the search space. The observed benefit isn't automation, but curation – a filter, not an engine. One anticipates, then, a shift in emphasis. The question isn't how to build an AI that solves CTF challenges, but how to build one that doesn't waste a human's time on the unsolvable.
The limitations are stark. The AI, despite its facility with language, remains dependent on pre-existing knowledge – a parrot, expertly mimicking insight. True innovation demands a departure from the known, a capacity currently beyond its reach. Future work must address this fundamental constraint, perhaps by exploring methods for inducing, or at least simulating, genuine curiosity. A system that needs to be told what is interesting has already failed to grasp the core principle.
Ultimately, the pursuit of "AI assistance" feels misdirected. The goal should not be to create a digital partner, but to build tools that diminish the need for expertise. Clarity is, after all, courtesy. A perfect system wouldn't require collaboration; it would simply be the solution. That, of course, is a distant ambition. But it is toward such austere simplicity that the field must strive.
Original article: https://arxiv.org/pdf/2602.20446.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/
2026-02-25 20:22