Author: Denis Avetisyan
This review details a novel workflow employing artificial intelligence to autonomously assess and strengthen the cybersecurity of critical industrial systems.

A multi-agent system leveraging large language models and a graph-based memory system automates penetration testing of ROS-based operational technology networks, improving traceability and regulatory compliance.
As digital infrastructures grow increasingly complex, ensuring their security through scalable and reliable methods remains a significant challenge. This is addressed in ‘Environment-Grounded Multi-Agent Workflow for Autonomous Penetration Testing’, which proposes a novel multi-agent system leveraging large language models to automate penetration testing of robotic operational technology networks. The approach utilizes a shared, graph-based memory to dynamically capture system state, achieving 100% success in a specialized robotics Capture-the-Flag scenario while maintaining crucial traceability for regulatory compliance. Could this environment-grounded architecture represent a paradigm shift towards more robust and auditable cybersecurity for increasingly interconnected robotic systems?
The Inherent Limitations of Manual Penetration Testing
Conventional penetration testing relies heavily on the expertise of highly trained security professionals who meticulously probe systems and networks for weaknesses. This process is inherently manual, demanding significant time and resources as each assessment requires detailed planning, execution, and analysis. Because of the need for specialized knowledge and the labor-intensive nature of the work, organizations often face limitations in the frequency and scope of their security evaluations. Consequently, vulnerabilities can persist undetected for extended periods, increasing the risk of exploitation and potential breaches. The demand for skilled penetration testers consistently outpaces supply, creating a bottleneck in proactive security measures and highlighting the need for innovative solutions that can augment or even automate aspects of this critical process.
Modern network infrastructure, characterized by dynamic cloud environments, microservices, and a proliferation of interconnected devices, presents a significant challenge to traditional security assessments. While automated penetration testing tools offer a potential solution to the increasing scale and speed of deployments, current offerings frequently struggle with adaptability and robustness. These tools often rely on predefined scripts and signatures, proving ineffective against novel vulnerabilities or complex, customized systems. A lack of intelligent navigation and contextual understanding limits their ability to effectively explore network topologies, bypass security mechanisms, and accurately identify exploitable weaknesses, ultimately requiring substantial manual intervention and diminishing the benefits of automation. Consequently, a critical need exists for more sophisticated systems capable of autonomously adapting to evolving network landscapes and delivering reliable, comprehensive vulnerability assessments.
The escalating intricacy of modern digital infrastructure creates a pressing need for penetration testing systems that move beyond scripted automation. Current tools often struggle with dynamic network configurations and novel attack surfaces, highlighting the limitations of pre-defined test cases. Consequently, research is increasingly focused on developing intelligent, autonomous agents capable of independently exploring network topologies, adapting to unforeseen circumstances, and proactively identifying vulnerabilities without explicit human guidance. These systems leverage techniques such as reinforcement learning and adversarial AI to mimic the behavior of skilled penetration testers, continuously refining their strategies and discovering previously unknown weaknesses – a crucial step toward bolstering cybersecurity in an ever-evolving threat landscape.

A Multi-Agent System for Autonomous Penetration Testing
The system architecture utilizes a Multi-Agent System, a computational paradigm where autonomous agents collaborate to achieve a collective goal. Orchestration is managed by LangGraph, a framework for building stateful, graph-structured applications around language models. This approach enables parallelized task execution, significantly reducing overall testing time by distributing workloads across multiple agents simultaneously. The system also adapts dynamically: agents adjust their strategies and priorities based on real-time feedback and evolving information, allowing the system to respond effectively to changing network conditions and to vulnerabilities discovered during the penetration testing process.
The system employs three core agent types to facilitate penetration testing. The Planner Agent is responsible for generating a sequence of tasks designed to achieve defined objectives. The Executor Agent then carries out these tasks, utilizing external tools such as network scanners like Nmap and scripting capabilities via Bash Scripts to interact with the target system. Finally, the Memory Agent focuses on retaining and refining information gathered during the process, creating a persistent knowledge base used to inform future task planning and adapt to the evolving state of the target environment.
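The interaction among the three agents can be pictured as a plan-execute-record loop. The sketch below is illustrative only: the class names, the fixed task list standing in for an LLM-generated plan, and the simulated tool output are all assumptions, not the paper's actual implementation.

```python
# Minimal sketch of the planner -> executor -> memory loop described above.
# All names (PlannerAgent, ExecutorAgent, MemoryAgent) are illustrative
# placeholders; the LLM planner and the external tools are stubbed.

class PlannerAgent:
    def plan(self, objective, memory):
        # A real planner would prompt an LLM with the objective and the
        # memory contents; here we return a fixed reconnaissance sequence,
        # skipping tasks the memory shows as already done.
        done = {entry["task"] for entry in memory.entries}
        candidates = ["scan ports", "enumerate services", "probe ros topics"]
        return [t for t in candidates if t not in done]

class ExecutorAgent:
    def execute(self, task):
        # A real executor would invoke tools such as Nmap or Bash scripts;
        # here we simulate a result so the loop is self-contained.
        return {"task": task, "result": f"simulated output of '{task}'"}

class MemoryAgent:
    def __init__(self):
        self.entries = []

    def record(self, observation):
        self.entries.append(observation)

def run_workflow(objective, max_rounds=5):
    planner, executor, memory = PlannerAgent(), ExecutorAgent(), MemoryAgent()
    for _ in range(max_rounds):
        tasks = planner.plan(objective, memory)
        if not tasks:  # nothing left to plan: stop iterating
            break
        for task in tasks:
            memory.record(executor.execute(task))
    return memory.entries

log = run_workflow("map the target network")
```

The key design point mirrored here is that planning consults memory before each round, so the loop terminates once no new tasks remain.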
The Memory Agent maintains a persistent knowledge base, termed GraphMemory, which functions as the system’s long-term memory. This GraphMemory stores discovered information about the target system, including network topology, open ports, identified services, and results of executed commands. Critically, the Memory Agent provides this historical data and contextual information to the Planner Agent, enabling it to refine subsequent task generation. This iterative process allows the penetration testing framework to avoid redundant scans, prioritize potentially vulnerable areas, and adapt its attack strategy based on previously acquired knowledge, significantly improving the efficiency and effectiveness of the testing process.
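A graph memory of this kind can be approximated with nodes for hosts and services and labelled edges between them. The structure and method names below are a guess at the general shape of such a store, not GraphMemory's actual API.

```python
# Illustrative sketch of a graph-based memory for penetration-testing
# findings: nodes carry attributes, edges are (src, relation, dst) triples.
# This mirrors the idea of GraphMemory described above, not its real API.

class GraphMemory:
    def __init__(self):
        self.nodes = {}     # node_id -> attribute dict
        self.edges = set()  # (src_id, relation, dst_id) triples

    def add_node(self, node_id, **attrs):
        self.nodes.setdefault(node_id, {}).update(attrs)

    def add_edge(self, src, relation, dst):
        self.edges.add((src, relation, dst))

    def neighbors(self, node_id, relation):
        return [d for s, r, d in self.edges if s == node_id and r == relation]

    def already_scanned(self, host):
        # Lets the planner skip redundant scans of known hosts.
        return self.nodes.get(host, {}).get("scanned", False)

mem = GraphMemory()
mem.add_node("10.0.0.5", kind="host", scanned=True)
mem.add_node("10.0.0.5:11311", kind="service", name="ros-master")
mem.add_edge("10.0.0.5", "runs", "10.0.0.5:11311")
```

Queries such as `already_scanned` are what allow the planner to avoid redundant work, as described above; a production store would also persist this graph between runs for auditability.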
Operational Workflow and System Dynamics
The system’s architecture is predicated on the use of predefined Operational Modes, which strictly regulate the actions available to each agent within the multi-agent workflow. These modes serve as a control mechanism, ensuring that agents operate within defined boundaries and contribute to the overall system goals in a coordinated manner. By limiting permissible actions based on the current operational mode, the system minimizes conflicting behaviors and promotes a predictable, structured execution of tasks. This approach enhances reliability and facilitates debugging, as agent behavior is constrained and therefore more easily traceable and verifiable.
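Mode-gating of this sort amounts to a whitelist of permissible actions per mode. The mode and action names below are invented for illustration; the paper's actual operational modes may differ.

```python
# Sketch of mode-gated agent actions: each operational mode whitelists
# the actions an agent may take, so out-of-mode behaviour is refused
# rather than silently executed. Names are illustrative.

from enum import Enum

class Mode(Enum):
    RECON = "recon"
    EXPLOIT = "exploit"
    REPORT = "report"

ALLOWED_ACTIONS = {
    Mode.RECON:   {"port_scan", "service_enum"},
    Mode.EXPLOIT: {"run_exploit", "service_enum"},
    Mode.REPORT:  {"write_report"},
}

def dispatch(mode, action):
    # Rejecting out-of-mode actions keeps agent behaviour predictable
    # and traceable, as the text above describes.
    if action not in ALLOWED_ACTIONS[mode]:
        raise PermissionError(f"{action!r} not allowed in mode {mode.value}")
    return f"executing {action}"
```

Because every refusal raises an explicit error, violations surface immediately in logs, which is what makes the constrained behaviour easy to debug and audit.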
The State variable functions as a central repository for all system data, including current task assignments, progress tracking, and relevant operational information. This shared memory architecture allows each agent within the multi-agent system to access and update a consistent view of the system’s status. Specifically, agents utilize the State variable to determine task dependencies, avoid redundant operations, and coordinate actions with other agents, facilitating a streamlined and coherent workflow. Updates to the State variable are managed to ensure data integrity and prevent conflicts between concurrent agent operations.
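One common way to manage such updates is optimistic concurrency: each writer supplies the state version it read, and stale writes are rejected. The field names and the compare-and-set scheme below are assumptions for illustration, not the system's documented mechanism.

```python
# Sketch of a shared State with versioned (compare-and-set) updates so
# concurrent agents cannot silently overwrite each other's changes.
# Field names and the versioning scheme are illustrative assumptions.

class State:
    def __init__(self):
        self.version = 0
        self.data = {"tasks": [], "findings": []}

    def snapshot(self):
        # Readers get the current version along with a copy of the data.
        return self.version, dict(self.data)

    def update(self, expected_version, key, value):
        # Reject the write if another agent updated state in between.
        if expected_version != self.version:
            return False
        self.data[key] = self.data[key] + [value]
        self.version += 1
        return True

state = State()
v, _ = state.snapshot()
ok = state.update(v, "findings", "open port 22")
stale = state.update(v, "findings", "duplicate write")  # version moved on
```

A rejected write forces the agent to re-read the state and re-plan, which is exactly the behaviour needed to avoid redundant or conflicting operations.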
The system’s `Planner Agent` utilizes large language models (LLMs) to facilitate advanced task decomposition and reasoning capabilities. Currently implemented LLMs include `Llama-3.3-70b-instruct`, `Deepseek-v3.2`, `Gemma-3-27b-it`, and `Hermes-2-pro-llama-3-8b`. These models are responsible for generating a sequence of actionable tasks based on the system’s objectives and the current `State`. The LLMs’ ability to understand complex scenarios and formulate appropriate responses is crucial for navigating the operational environment and achieving successful task completion, particularly within the context of the defined `Operational Modes`.
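A planner of this kind has to render the current state into a prompt and parse the model's completion back into discrete tasks. The prompt shape and the numbered-list convention below are assumptions, and the LLM call is replaced with a canned response so the example is self-contained.

```python
# Sketch of how a planner might turn an LLM completion into a task list.
# The prompt format and numbered-list convention are assumptions; the
# LLM call is stubbed with a canned response so the example runs as-is.

import re

def build_prompt(objective, known_facts):
    facts = "\n".join(f"- {f}" for f in known_facts)
    return (f"Objective: {objective}\n"
            f"Known facts:\n{facts}\n"
            "List the next tasks as a numbered list.")

def parse_tasks(completion):
    # Accept lines such as "1. scan ports" or "2) enumerate services".
    return re.findall(r"^\s*\d+[.)]\s*(.+)$", completion, flags=re.MULTILINE)

prompt = build_prompt(
    "compromise the ROS master",
    ["host 10.0.0.5 is up", "port 11311 open"],
)
canned = ("1. fingerprint the ROS master\n"
          "2) list advertised topics\n"
          "3. check for unauthenticated publishers")
tasks = parse_tasks(canned)
```

Parsing free-form completions into a strict task schema is also what makes each planning step loggable, which matters for the traceability requirements discussed later.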
Evaluations within a ROS-based Operational Technology (OT) network, structured as a Capture The Flag (CTF) exercise, show clearly differentiated performance across tasks. All evaluated configurations, including the benchmark systems, achieved 100% completion of the initial reconnaissance tasks, CTF-0 and CTF-1. Success rates dropped sharply on the subsequent exploitation challenges: only the proposed workflow, paired with the Llama-3.3-70b-instruct LLM, completed both CTF-2 and CTF-3, indicating a substantial advantage in complex, exploitation-focused scenarios and a significant advance in automated OT network security assessment.
Future Trajectories and Regulatory Considerations
The system’s modular architecture is intentionally designed for future adaptability, recognizing that the landscape of cybersecurity is in constant flux. This allows for the seamless incorporation of novel security tools and techniques as they emerge, without requiring a fundamental overhaul of the existing framework. Researchers can readily integrate updated vulnerability scanners, refined threat intelligence feeds, or advanced mitigation strategies, ensuring the system remains effective against evolving threats. This extensibility isn’t merely about adding new components; it’s about fostering a dynamic security posture, capable of proactively addressing future challenges and maintaining a robust defense against increasingly sophisticated attacks.
The increasing deployment of autonomous systems demands careful consideration of emerging regulatory frameworks, most notably the stipulations within the EU AI Act. This legislation underscores the critical need for transparency in algorithmic decision-making, ensuring that the rationale behind an autonomous system’s actions is readily understandable and auditable. Equally important is traceability, which requires a clear record of data provenance and processing steps, enabling effective identification and mitigation of potential biases or errors. Ultimately, robust human oversight remains paramount; systems must be designed to allow for meaningful human intervention and control, preventing unintended consequences and fostering public trust in these increasingly pervasive technologies. Adherence to these principles is not merely a legal requirement, but a fundamental step towards responsible innovation and the safe integration of autonomous systems into society.
Continued development centers on fortifying the system against adversarial conditions and refining its accuracy. Current efforts prioritize minimizing false positive identifications, a common challenge in automated security tools, through advanced filtering and contextual analysis techniques. A key objective is the automation of vulnerability validation, moving beyond simple detection to provide confirmed and actionable intelligence; this involves integrating automated exploitation frameworks and incorporating feedback loops to continuously improve the system’s discernment. Ultimately, these enhancements aim to create a more reliable and trustworthy autonomous security solution capable of operating effectively in dynamic and unpredictable environments.
The system’s operational capacity within intricate and dynamic environments is significantly bolstered by its integration with Data Distribution Service (DDS) through the Robot Operating System 2 (ROS2) framework. DDS facilitates reliable, real-time data exchange, crucial for coordinating actions and maintaining system integrity when faced with noisy sensor data or intermittent connectivity. This communication paradigm allows for decentralized operation, diminishing single points of failure and enabling the system to scale efficiently as complexity increases. By leveraging DDS’s quality-of-service features, the architecture dynamically adapts to network conditions and prioritizes critical information, ensuring robust performance even in challenging real-world deployments where consistent communication is paramount for maintaining situational awareness and executing autonomous functions.
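The quality-of-service adaptation described above can be modelled abstractly as choosing between delivery profiles based on link health. The profiles and the degradation rule below are invented for illustration and do not use the real rclpy/DDS API, though the reliability and durability settings echo standard DDS QoS policies.

```python
# Illustrative model of DDS-style quality-of-service selection. The
# profiles and the loss-rate threshold are invented assumptions; real
# ROS2 code would use rclpy QoS settings instead of these dataclasses.

from dataclasses import dataclass

@dataclass(frozen=True)
class QoSProfile:
    reliability: str  # "reliable" or "best_effort"
    durability: str   # "transient_local" or "volatile"
    depth: int        # history queue depth

RELIABLE = QoSProfile("reliable", "transient_local", depth=10)
DEGRADED = QoSProfile("best_effort", "volatile", depth=1)

def choose_profile(loss_rate):
    # Under heavy loss, fall back to best-effort delivery so fresh data
    # keeps flowing instead of blocking on retransmissions.
    return DEGRADED if loss_rate > 0.2 else RELIABLE
```

The design trade-off sketched here is the standard one: reliable delivery with history suits configuration and command channels, while best-effort with a shallow queue suits high-rate sensor streams on lossy links.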
The pursuit of automated penetration testing, as detailed in this work, echoes a fundamental mathematical principle: the reduction of complex systems to provable components. This research prioritizes traceability and regulatory alignment, aiming for a demonstrably correct workflow, not merely one that appears to function. G.H. Hardy prized exactness above all, insisting that mathematical reasoning leaves no room for ambiguity. That standard aligns with the challenges of translating cybersecurity objectives into verifiable agent behaviors. The graph-based memory system, a key innovation, attempts to impose order on the inherent ambiguity, ensuring that each action within the multi-agent system is logically connected and auditable, a nod towards mathematical rigor in a practical domain.
What Lies Ahead?
The presented work, while demonstrating a functional integration of large language models into an automated penetration testing workflow, merely scratches the surface of a deeper, more fundamental challenge. The system operates, demonstrably, but a formal verification of its ‘correctness’ remains conspicuously absent. The graph-based memory, while offering improved traceability, is still reliant on heuristics – a pragmatic concession, perhaps, but one that introduces the potential for systematic error. True elegance would demand a provable system, where each action is derived from a logical first principle, not an empirically observed correlation.
Future efforts should not focus on expanding the breadth of this system – adding more agents or more attack vectors – but on strengthening its logical foundations. The current reliance on LLMs as ‘reasoning engines’ is a temporary measure. A more robust solution would involve translating the LLM’s outputs into formal logic, allowing for rigorous analysis and validation. Only then can one speak of a genuinely reliable, autonomous system for cybersecurity assessment.
The ultimate goal is not simply to automate penetration testing, but to formalize the very concept of vulnerability. A system capable of mathematically defining and proving the absence of security flaws would represent a paradigm shift, moving beyond reactive measures to proactive, provable security. This, admittedly, is a lofty ambition, but the pursuit of mathematical purity is, after all, the only rational course.
Original article: https://arxiv.org/pdf/2603.24221.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/
2026-03-26 16:03