Securing the Rise of AI Agents

Author: Denis Avetisyan

As AI systems gain autonomy, a new approach to security is needed to ensure reliable and trustworthy operation.

This review introduces authenticated workflows-a systems-level trust layer for agentic AI built on cryptographic verification and deterministic policy enforcement.

While agentic AI promises workflow automation, existing security measures-reliant on probabilistic guardrails and semantic filters-are routinely circumvented. This paper introduces ‘Authenticated Workflows: A Systems Approach to Protecting Agentic AI’, presenting the first complete trust layer for enterprise agentic systems by enforcing cryptographic authenticity and intent at system boundaries-prompts, tools, data, and context. Through a novel AI-native policy language (MAPL) and a universal security runtime integrating leading frameworks, we achieve deterministic security with 100% recall and zero false positives. Can this approach finally deliver the robust, scalable security required to unlock the full potential of agentic AI in critical enterprise applications?

The Illusion of Perimeter Security

Traditional security strategies, historically focused on establishing strong perimeters-like castle walls around a network-are increasingly failing to protect valuable assets. This approach assumes threats originate from outside, neglecting the significant risk posed by compromised insiders or vulnerabilities embedded within the supply chain. Sophisticated attackers routinely bypass these defenses through techniques like phishing, social engineering, or by exploiting trusted relationships with third-party vendors. The interconnected nature of modern systems means a single compromised component – a software library, a cloud service, or a seemingly benign update – can provide attackers with a foothold deep inside the network, rendering perimeter security largely ineffective. Consequently, organizations are realizing the necessity of shifting towards more proactive, internal-focused security measures that emphasize continuous monitoring, threat detection, and robust access controls throughout the entire system lifecycle.

Traditional security architectures, designed for centralized, monolithic systems, increasingly falter when applied to the sprawling complexity of modern distributed environments. These systems, often comprising microservices, cloud infrastructure, and numerous interconnected devices, lack the inherent boundaries necessary for effective perimeter defense. Consequently, enforcing precise access control – determining exactly which entities can access specific resources – becomes exceptionally difficult. Instead of granular permissions, organizations often resort to broad allowances, creating substantial attack surfaces. This lack of granularity isn’t merely a technical limitation; it stems from the fundamental mismatch between security models built for simplicity and the dynamic, interwoven nature of contemporary systems, demanding a shift toward more adaptive and context-aware security solutions.

The increasing prevalence of agentic systems – interconnected networks where autonomous entities interact and pursue goals – presents a significant escalation in security challenges. Traditional security models, designed to protect static perimeters, struggle to comprehend the dynamic relationships and emergent behaviors within these systems. Each agent, potentially vulnerable and acting with a degree of independence, expands the attack surface exponentially. Securing such a system isn’t simply about protecting individual components, but about understanding and controlling the complex web of interactions between them. This necessitates a shift from preventative, perimeter-based security to a more adaptive, intent-based paradigm, one that focuses on verifying the trustworthiness of interactions and ensuring that collective behavior aligns with desired outcomes – a fundamental rethinking of how security is achieved in the age of intelligent, distributed systems.

Trust: A Layer, Not a Replacement

A dedicated Trust Layer functions as an intermediary security architecture situated between core infrastructure components and application-level defenses. This positioning allows for consistent policy enforcement and verification procedures before any interaction between these layers, independent of individual application or infrastructure security measures. The Trust Layer is not intended to replace existing security controls, but rather to augment them by providing a centralized point for establishing and validating trust, thereby reducing the attack surface and improving overall system resilience. It achieves this through mechanisms like policy definition, attestation verification, and secure logging, providing a consistent and auditable trust framework across the entire system.

The Trust Layer facilitates cryptographic attestations, which function as verifiable proofs that specific operations have been successfully completed. These attestations are not simply acknowledgements, but cryptographically signed statements enabling validation of process integrity. Crucially, this infrastructure supports sequential enforcement of workflow dependencies; an operation cannot proceed until the preceding operation’s completion is attested and verified. This ensures a defined order of execution and prevents unauthorized bypassing of required steps, contributing to a deterministic and auditable system state. The attestations themselves are data structures containing details of the completed operation, the agent performing it, and a digital signature allowing for non-repudiation and integrity checking.

Cryptographic verification at each agentic boundary crossing establishes a chain of trust by demanding proof of legitimacy before allowing data or control flow. This process involves verifying the authenticity and integrity of entities attempting to interact, ensuring that only authorized and uncompromised agents can proceed. Each successful verification step builds upon the previous one, creating an auditable record and limiting the blast radius of potential compromises. If an entity is found to be invalid or malicious at any boundary, the interaction is halted, preventing propagation of untrusted data or unauthorized actions, and thereby mitigating risks associated with compromised entities within the system.

Authenticated Workflows: Security in Motion

Authenticated Workflows implement security by enforcing cryptographic verification at each stage of interaction within an agentic system. This is achieved through Protocol-Level Security, meaning security measures are integrated directly into the communication protocols themselves, rather than being applied as an external layer. Specifically, every request and response is cryptographically signed and validated, confirming both the authenticity of the sender and the integrity of the data transmitted. This approach contrasts with traditional methods which often rely on verifying identity only at initial access and then trusting subsequent interactions, and enables continuous, granular authorization based on verifiable claims embedded within the protocol.

Traditional security models typically rely on perimeter defenses, establishing a fortified boundary around a system and verifying access only at the entry point. Authenticated Workflows diverge from this approach by implementing continuous authentication and authorization throughout the entire interaction lifecycle. This means that every request, data exchange, and function call within the agentic system is cryptographically verified, rather than solely relying on initial authentication. This shift moves the security focus from a static barrier to a dynamic, ongoing process, ensuring that each interaction is independently validated and that compromised credentials or internal attacks are immediately detected and mitigated, irrespective of the entry point.

Independent Verification and Defense in Depth are core tenets of this security architecture. Independent Verification ensures that each component’s outputs are validated by separate, non-trusting entities before being used by subsequent components, preventing propagation of compromised data. Defense in Depth layers multiple security mechanisms – including cryptographic verification, access controls, and anomaly detection – to create redundant safeguards. This layered approach mitigates the impact of any single point of failure; if one mechanism is bypassed or compromised, others remain operational to maintain system integrity and resilience against attack vectors. The implementation minimizes reliance on any single security measure, significantly reducing overall system vulnerability.

Rigorous testing demonstrates the system’s efficacy in preventing attacks with deterministic guarantees. Across 174 distinct test cases, the system achieved 100% recall – identifying all attack attempts – with zero false positives, indicating no erroneous flagging of legitimate activity. This performance extends to real-world scenarios; the system successfully mitigated two publicly disclosed Common Vulnerabilities and Exposures (CVEs) affecting OpenAI Atlas and GitHub MCP, validating its ability to address production-level security threats.

MAPL: Defining Trust with Cryptography

The MAPL Policy Language utilizes cryptographic attestations to define constraints on agents – software entities performing actions within a system – and to specify dependencies between workflow steps. These attestations, digitally signed statements verifying the state of a system or the execution of an operation, serve as the basis for policy definitions. An agentic constraint, expressed via attestation, limits an agent’s permissible actions; a workflow dependency, similarly defined, mandates the completion of a prior step before subsequent operations can proceed. This approach allows policies to be stated as verifiable claims about system behavior, moving beyond traditional access control lists to a more declarative and auditable security model. The use of cryptography ensures both the integrity and authenticity of policy statements and the data used to evaluate them.

MAPL facilitates the creation of security policies with a granularity exceeding traditional access control lists. These policies define permissible actions based on specific conditions and attributes of both the requesting agent and the accessed resource. Policy Enforcement Points (PEPs), strategically positioned at control surfaces – points where agents interact with resources – interpret and enforce these MAPL-defined policies. PEPs verify that each access request adheres to the established rules before granting access, providing a mechanism for automated and consistent security enforcement. The deployment of PEPs allows for decentralized policy enforcement, enabling scalable security across complex systems and adaptable control over resource access.

Surface Completeness, a core tenet of MAPL policy enforcement, mandates that all access operations within a system must traverse defined security boundaries. This requirement ensures that every data access, function call, or state transition is subject to policy evaluation at designated control surfaces. Consequently, all security-relevant events are observable and auditable, providing a comprehensive record of system behavior. By forcing all operations to explicitly cross these boundaries, MAPL eliminates implicit trust relationships and prevents unobserved or unenforced access, thereby facilitating complete monitoring and robust enforcement of defined security policies.

Byzantine systems, characterized by potential failures and malicious actors, traditionally present significant trust challenges. MAPL enhances security within these systems by shifting from reliance on trusted execution environments to verifiable attestations of policy compliance. This is achieved through cryptographic proofs demonstrating that agents have adhered to defined constraints and workflows. Unlike traditional methods vulnerable to internal compromise, MAPL’s attestation-based approach allows for external verification of system behavior, establishing a robust level of trust independent of the integrity of individual components. This enables reliable operation even in the presence of compromised or faulty nodes, improving the overall resilience and security posture of the Byzantine system.

The pursuit of securing agentic AI, as detailed in this paper, feels…predictable. This focus on authenticated workflows-bounding systems deterministically with cryptographic verification-is simply a more elaborate form of the same old problem: controlling entropy. Kolmogorov himself observed, “The most important thing in science is not to be afraid of new ideas, but to be willing to discard old ones.” Yet, here we are, layering complexity upon complexity, building ‘composable security’ as if detection-based approaches were failures of implementation, not inherent limitations. It’s a beautifully engineered solution, certainly, but one can’t help but suspect it will eventually become tomorrow’s tech debt, a new surface for attacks to emerge, especially as agentic systems inevitably find ways around even the most rigorous policy enforcement.

What’s Next?

The promise of authenticated workflows-deterministic bounds on agentic AI-is, predictably, more elegant in theory than it will prove to be in production. Every abstraction dies, and this one will be no exception. The current framing rightly targets prompt injection, but the surface area for adversarial behavior expands with every composable component. Future work will inevitably confront the limitations of cryptographic verification when applied to systems designed for emergent behavior-a deterministic trust layer attempting to contain a fundamentally unpredictable core.

A significant challenge lies in scaling these authenticated workflows beyond isolated demonstrations. The overhead of constant verification, policy enforcement, and key management will become a practical concern. The research will need to address the tension between robust security and acceptable performance, acknowledging that complete protection is an asymptotic goal, never truly reached. The inevitable compromises will be fascinating, if frustrating, to observe.

Ultimately, the true test won’t be whether these workflows prevent all attacks, but how gracefully they fail. Production environments are not laboratories. Every system will crash eventually; the question is whether that crash is accompanied by catastrophic data exposure or a controlled, auditable rollback. The pursuit of perfect security is a fool’s errand, but structured failure-that is a worthy endeavor.

Original article: https://arxiv.org/pdf/2602.10465.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/

The Illusion of Perimeter Security

Trust: A Layer, Not a Replacement

Authenticated Workflows: Security in Motion

MAPL: Defining Trust with Cryptography

What’s Next?

See also: