Planning with Guardrails: A New Language for Reliable AI

Author: Denis Avetisyan


Researchers have developed a system that enforces data isolation in AI planning workflows, enhancing transparency and auditability.

NormCode is a semi-formal language and execution framework for context-isolated AI planning, improving the reliability of complex, LLM-orchestrated data flows.

Complex, multi-step AI workflows built upon large language models are increasingly susceptible to errors arising from accumulating contextual noise. This paper introduces NormCode: A Semi-Formal Language for Context-Isolated AI Planning, a novel approach that enforces strict data isolation between workflow steps, eliminating cross-step contamination. By separating semantic and syntactic operations, and offering isomorphic representations for authoring, execution, and verification, NormCode enables auditable and reliable AI planning. Could this framework unlock greater transparency and trust in high-stakes applications like legal reasoning and medical decision-making?


The Inevitable Erosion of Context

Large language model agents, despite their impressive capabilities, frequently encounter a phenomenon termed ‘context pollution’ as they undertake complex reasoning tasks. Each step in their thought process – each inference, calculation, or retrieved piece of information – is appended to the existing context window. This accumulation, while seemingly beneficial, introduces noise and inconsistencies; the model struggles to differentiate between vital information and irrelevant details from earlier steps. Consequently, the quality of output degrades as the reasoning chain lengthens, leading to unreliable and unpredictable results. Essentially, the very process of thinking pollutes the foundation upon which further thought is built, creating a self-defeating cycle that limits the scalability and trustworthiness of these agents in demanding applications.

Current AI frameworks, prominently including LangChain and AutoGPT, frequently struggle to maintain consistent reasoning due to a fundamental architectural limitation: a lack of isolated inference pathways. As these agents execute multi-step tasks, each reasoning step and generated output is appended to the existing context window, creating a cumulative record that introduces noise and irrelevant information. This ‘context pollution’ hinders the agent’s ability to accurately assess subsequent steps, leading to errors and unreliable outputs as the process unfolds. Without dedicated mechanisms to compartmentalize and refine inference – to effectively ‘forget’ irrelevant details or focus solely on the current analytical need – such frameworks fall short of the truly complex, verifiable AI systems capable of sustained, logical thought.
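To make the failure mode concrete, consider a minimal sketch of the append-only loop at the heart of many agent frameworks. Every name here is illustrative rather than drawn from the paper or any particular library; the point is only that each step’s output is replayed into every subsequent prompt, so noise compounds.

```python
# Minimal sketch of an append-only agent loop. The `llm` callable is
# a stand-in for any model API; nothing here is framework-specific.

def run_agent(task: str, llm, max_steps: int = 10) -> str:
    context = [f"TASK: {task}"]
    for _ in range(max_steps):
        prompt = "\n".join(context)   # every prior step is replayed...
        thought = llm(prompt)
        context.append(thought)       # ...and appended forever, noise included
        if thought.startswith("FINAL:"):
            return thought
    return context[-1]
```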

The construction of truly complex artificial intelligence systems is currently hampered by a fundamental limitation in maintaining reliable performance across multiple reasoning steps. Without the capacity to isolate individual inferences, current architectures struggle to prevent the accumulation of errors and irrelevant information – a phenomenon known as context pollution. This lack of isolation isn’t merely a matter of decreased accuracy; it actively prevents the development of AI capable of consistently verifiable results, particularly in domains demanding precision and auditability. Consequently, building systems that reliably navigate intricate, multi-step problems – such as complex scientific simulations, legal reasoning, or financial modeling – remains a significant challenge, as even minor initial errors can cascade into substantial deviations over time, undermining the integrity of the final outcome.

Architecting for Isolation: The NormCode Approach

NormCode utilizes a semi-formal language to facilitate context-isolated AI planning, a design choice implemented to mitigate the issue of context pollution. Context pollution occurs when irrelevant or previously processed data unduly influences subsequent reasoning steps, leading to unpredictable or incorrect outputs. The NormCode language achieves isolation by strictly defining the boundaries of information flow within a planning process; data is explicitly passed between distinct reasoning modules and is not implicitly shared or retained outside of these defined pathways. This enforced isolation ensures that each planning step operates on a controlled and predictable data set, enhancing the reliability and reproducibility of AI-driven decision-making processes.

Within NormCode, an ‘Inference’ represents a discrete, self-contained unit of reasoning. Each Inference is characterized by a clearly defined set of inputs and a corresponding set of outputs, both of which are explicitly specified. This structure promotes modularity by allowing complex plans to be decomposed into smaller, manageable units. Furthermore, the explicit definition of inputs and outputs facilitates traceability; the lineage of any output can be directly traced back to its originating inputs and the specific Inference that generated it, enabling debugging and verification of the reasoning process. This design is fundamental to preventing context pollution and maintaining consistent, reliable AI planning.
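A rough Python analogue of that contract might look like the following; the field names and the `execute` wrapper are our assumptions for illustration, not NormCode’s actual notation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple

# Illustrative analogue of an Inference: a discrete reasoning unit
# with explicitly declared inputs and outputs.

@dataclass
class Inference:
    name: str
    inputs: Tuple[str, ...]    # the only keys this step may read
    outputs: Tuple[str, ...]   # the only keys this step may publish
    run: Callable[[Dict[str, Any]], Dict[str, Any]]

    def execute(self, available: Dict[str, Any]) -> Dict[str, Any]:
        visible = {k: available[k] for k in self.inputs}  # hard boundary
        produced = self.run(visible)
        missing = set(self.outputs) - set(produced)
        if missing:
            raise ValueError(f"{self.name} did not produce {missing}")
        # Publishing only declared outputs keeps lineage traceable:
        # every value maps back to exactly one named Inference.
        return {k: produced[k] for k in self.outputs}
```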

NormCode achieves data isolation by strictly defining the inputs and outputs of each ‘Inference’ unit; data generated within one Inference is not directly accessible by others unless explicitly passed as an input to a subsequent Inference. This enforced separation prevents unintended information leakage between reasoning steps and eliminates the potential for context pollution, where prior inferences inappropriately influence later ones. By controlling data flow, NormCode ensures that each Inference operates on a defined and consistent dataset, contributing to predictable and reproducible reasoning outcomes across multi-step planning processes. The system relies on explicitly defined interfaces, rather than shared global state, to maintain data integrity and traceability.
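Continuing the sketch above, two such units can be composed so that data crosses between them only through declared keys; the second step simply cannot observe anything the first did not publish.

```python
# Composing two Inference units (from the sketch above). step_b sees
# only the 'claims' key; any scratch work inside step_a is invisible.

step_a = Inference(
    name="extract",
    inputs=("document",),
    outputs=("claims",),
    run=lambda d: {"claims": d["document"].split(". ")},
)
step_b = Inference(
    name="score",
    inputs=("claims",),
    outputs=("verdict",),
    run=lambda d: {"verdict": f"{len(d['claims'])} claims checked"},
)

store = {"document": "The sky is blue. Water is wet."}
store.update(step_a.execute(store))
store.update(step_b.execute(store))
print(store["verdict"])  # -> 2 claims checked
```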

The Reference System: Structuring Data for Controlled Access

The NormCode Reference System is a data encapsulation and management framework designed to structure data along named axes, effectively creating a multi-dimensional space for data organization. This approach moves beyond simple linear data storage by associating data with specific identifiers along each axis, allowing for precise addressing and retrieval. Crucially, this system enforces data isolation by restricting access based on these named axes; operations can only access data explicitly associated with the currently active axes, preventing unintended modification or exposure of unrelated data elements. This granular control over data access is a core component of NormCode’s security and data integrity features.
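As a loose illustration of the idea, one can imagine a store keyed by named axes that refuses any lookup outside the axes an operation has been granted. The class and method names below are our assumptions, not the paper’s interface.

```python
# Sketch of axis-addressed storage with access control.

class ReferenceStore:
    def __init__(self):
        self._data = {}  # frozenset of (axis, value) pairs -> payload

    def put(self, payload, **axes):
        self._data[frozenset(axes.items())] = payload

    def get(self, active: dict, **axes):
        # A lookup may use only axis bindings the operation was granted.
        if not set(axes.items()) <= set(active.items()):
            raise PermissionError("address lies outside the active axes")
        return self._data[frozenset(axes.items())]

store = ReferenceStore()
store.put("patient history", case="A17", phase="intake")
store.get({"case": "A17", "phase": "intake"}, case="A17", phase="intake")  # ok
# store.get({"case": "A17"}, case="A17", phase="intake")  # PermissionError
```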

Perceptual Signs within the NormCode Reference System function as data pointers that prioritize on-demand data loading. Rather than immediately retrieving and storing complete datasets, these signs maintain minimal metadata referencing the data’s location. This deferred loading strategy significantly reduces initial memory consumption and improves operational efficiency, as data is only accessed when explicitly required by a Syntactic or Semantic Operation. The implementation avoids unnecessary data transfer and storage, optimizing resource utilization, particularly within large-scale data environments and complex computational processes.
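In code, such a pointer is essentially a memoised thunk: only a locator is held until the payload is first demanded. The `PerceptualSign` class and simulated loader below are our illustrative reading of the concept, not the paper’s code.

```python
# Sketch of a deferred-loading data pointer.

class PerceptualSign:
    def __init__(self, locator: str, loader):
        self.locator = locator   # lightweight metadata: where the data lives
        self._loader = loader    # how to fetch it, invoked at most once
        self._cache = None
        self._loaded = False

    def resolve(self):
        if not self._loaded:     # no I/O happens before this point
            self._cache = self._loader(self.locator)
            self._loaded = True
        return self._cache

sign = PerceptualSign("corpus/doc-042.txt", loader=lambda p: f"<contents of {p}>")
print(sign.resolve())  # first access triggers the (simulated) load
```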

All data manipulation within the NormCode Reference System is confined to operations categorized as either ‘Syntactic’ or ‘Semantic’. Syntactic Operations represent deterministic processes – predictable data transformations executed through defined rules. Conversely, Semantic Operations encompass AI-driven reasoning, specifically leveraging Large Language Model (LLM)-based agents to interpret and process data. Critically, both operation types are strictly contained within the isolated environment of the Reference System, preventing external access or modification of managed data and ensuring operational consistency.
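The distinction is easy to caricature in a few lines: one path is a pure function, the other delegates to a model. The `llm` parameter below is a stand-in, not a real API.

```python
# Sketch of the two operation kinds, with illustrative names.

def syntactic_sort(records: list) -> list:
    # Syntactic: deterministic, rule-defined, reproducible bit-for-bit.
    return sorted(records)

def semantic_summarize(records: list, llm) -> str:
    # Semantic: interpretation is delegated to an LLM-backed agent,
    # which still receives only the data explicitly handed to it.
    return llm("Summarize:\n" + "\n".join(records))
```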

From Authoring to Assurance: A Complete Ecosystem

NormCode establishes a comprehensive lifecycle for plan development through its ‘Three-Format Ecosystem’. This system utilizes distinct file types – .ncds for authoring, enabling intuitive plan creation and modification; .ncd for execution, translating authored plans into actionable steps for an AI agent; and .ncn for verification, rigorously testing the plan’s correctness and safety. This interconnectedness ensures a seamless transition from initial concept to deployed application, fostering a feedback loop where verification results inform and refine subsequent authoring iterations. By encapsulating the entire plan development process within these three formats, NormCode facilitates not only the creation of AI plans but also provides a robust framework for building confidence in their predictable and reliable behavior.
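A driver for that loop might look roughly like this; the `compile_plan` and `verify_plan` stubs are hypothetical stand-ins, since the paper does not expose its toolchain as a Python API.

```python
from pathlib import Path

def compile_plan(source: str) -> str:
    return source      # hypothetical stand-in for the NormCode compiler

def verify_plan(plan: str) -> str:
    return "OK" if plan.strip() else "EMPTY"  # hypothetical verifier stub

def lifecycle(plan_name: str) -> str:
    authored = Path(f"{plan_name}.ncds").read_text()              # authoring format
    Path(f"{plan_name}.ncd").write_text(compile_plan(authored))   # executable form
    report = verify_plan(Path(f"{plan_name}.ncd").read_text())
    Path(f"{plan_name}.ncn").write_text(report)                   # verification record
    return report   # findings feed the next authoring pass
```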

The core of NormCode’s operational framework lies within its ‘Orchestrator’, a system designed to meticulously manage ‘Agent Sequences’. These sequences function as complex pipelines of inferences, where each step relies on the successful completion of prior stages. The Orchestrator doesn’t simply execute these inferences in order; it actively tracks dependencies between them, ensuring that data flows correctly and that each agent receives the necessary information at the appropriate time. This careful scheduling isn’t merely about efficiency; it’s crucial for maintaining the integrity of the overall plan, preventing errors that could arise from improperly sequenced operations, and ultimately, guaranteeing a predictable and reliable outcome from the AI system.
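Dependency-aware scheduling of this kind reduces, at its simplest, to a topological sort over the steps’ declared data needs; a minimal sketch with illustrative step names:

```python
from graphlib import TopologicalSorter

# Each step names the steps whose outputs it consumes; the scheduler
# derives an order in which every dependency is satisfied first.

def schedule(dependencies: dict) -> list:
    return list(TopologicalSorter(dependencies).static_order())

order = schedule({
    "extract":   set(),
    "summarize": {"extract"},
    "score":     {"extract"},
    "report":    {"summarize", "score"},
})
print(order)  # e.g. ['extract', 'summarize', 'score', 'report']
```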

The development of robust and dependable artificial intelligence necessitates more than just advanced algorithms; it demands a system capable of proving its own correctness. This is achieved through formal specification and verification, a process where AI plans are expressed in a mathematically rigorous language and then subjected to automated checks to ensure they adhere to predefined safety and reliability standards. By translating intentions into formal logic, the system can definitively establish whether a plan will always behave as expected, eliminating ambiguity and potential for error. This capability is paramount in critical applications, such as autonomous vehicles or medical diagnosis, where even a minor flaw could have significant consequences. The resulting assurance builds trust not only in the system’s functionality, but also in its predictable and safe operation, fostering wider adoption and responsible innovation in the field of AI.
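What might such an automated check look like in miniature? One mechanical property a verifier can establish is that every input a step declares is actually produced somewhere, and that the resulting data-flow graph is acyclic. The sketch below is our illustration of that idea, not the paper’s verifier.

```python
from graphlib import TopologicalSorter, CycleError

# Two mechanical plan checks: every declared input must be produced
# (or supplied initially), and the data-flow graph must be acyclic.

def check_plan(steps: dict, initial: set) -> list:
    errors = []
    produced = set(initial)
    for spec in steps.values():
        produced |= set(spec["outputs"])
    deps = {}
    for name, spec in steps.items():
        missing = set(spec["inputs"]) - produced
        if missing:
            errors.append(f"{name}: inputs {sorted(missing)} never produced")
        deps[name] = {other for other, o in steps.items()
                      if set(o["outputs"]) & set(spec["inputs"])}
    try:
        list(TopologicalSorter(deps).static_order())
    except CycleError:
        errors.append("plan contains a dependency cycle")
    return errors
```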

Self-Hosting and the Future of Architectural Resilience

NormCode distinguishes itself through a unique capacity for self-hosting, a principle where the system is implemented using the language itself. This isn’t merely a theoretical construct; the NormCode compiler is, in fact, a NormCode plan – a program written within the language to build and execute other programs. This self-referential architecture fosters a powerful level of expressiveness and flexibility, allowing for meta-programming and the dynamic creation of new functionalities without relying on external tools or languages. By embedding the compilation process within its own framework, NormCode achieves a level of autonomy and internal consistency that could pave the way for more adaptable and self-improving artificial intelligence systems.
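The flavour of this self-reference can be conveyed in a toy form: if a plan is just data, then one plan can emit another plan, which the same interpreter then runs. The sketch below is purely illustrative; NormCode’s actual compiler-as-plan is far richer.

```python
# Toy flavour of self-hosting: plans are data, and a plan can emit a plan.

def interpret(plan: list, env: dict) -> dict:
    for label, key, fn in plan:     # each step: (label, output key, function)
        env[key] = fn(env)
    return env

# A "compiler" plan whose single step produces a new plan as data...
compiler_plan = [("emit", "child_plan",
                  lambda env: [("add", "sum", lambda e: e["x"] + e["y"])])]
env = interpret(compiler_plan, {"x": 2, "y": 3})
# ...which the same interpreter then executes.
print(interpret(env["child_plan"], env)["sum"])  # -> 5
```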

NormCode distinguishes itself through an inherent ability to define intricate operations directly within its own framework, mirroring the capabilities of Planning Domain Definition Language (PDDL). This internal expressiveness transcends simple computation; it enables the language to not only execute instructions but also to describe how to solve problems, effectively embedding a problem-solving engine within its structure. By allowing complex procedures to be defined as NormCode plans themselves, the system achieves a level of meta-programming sophistication rarely seen in traditional architectures. This allows for dynamic adaptation and the creation of highly specialized functions without relying on external libraries or pre-defined code, potentially leading to more flexible and efficient AI systems capable of self-modification and emergent behavior.

The NormCode system’s reliability is powerfully demonstrated through its flawless performance on base-X addition tasks, as detailed in recent research. This achievement isn’t merely a successful computation; it establishes a quantifiable benchmark for evaluating the system’s core functionality and inherent stability. Achieving 100% accuracy across varying numerical bases, from binary to more complex systems, validates NormCode’s capacity for precise calculation and consistent output, showcasing its potential as a foundation for more intricate computational processes. This benchmark serves as a crucial indicator that the language can consistently and accurately execute defined operations, even as complexity increases, making it a promising platform for self-hosted AI architectures and beyond.
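Base-X addition makes a good benchmark precisely because a deterministic oracle is trivial to write, so a plan’s output can be scored exactly. A reference implementation, assuming digits are given least-significant first as lists of integers (our convention, not necessarily the paper’s):

```python
# Exact oracle for base-X addition (digits least-significant first).

def base_x_add(a: list, b: list, base: int) -> list:
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        result.append(s % base)
        carry = s // base
    if carry:
        result.append(carry)
    return result

# 5 + 3 = 8 in binary: 101 + 011 = 1000 (shown least-significant first)
assert base_x_add([1, 0, 1], [1, 1, 0], base=2) == [0, 0, 0, 1]
```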

The pursuit of auditable AI, as detailed in this work concerning NormCode, echoes a sentiment expressed long ago by Carl Friedrich Gauss: “I would rather explain one difficult concept well than ten simple ones poorly.” This language, deliberately semi-formal, isn’t about simplifying the complexities of AI planning; it’s about making those complexities explicit. The framework’s emphasis on context isolation isn’t a restriction, but a necessary constraint – a way to illuminate the data flow and prevent the inevitable entropy of opaque systems. Long-term stability, a seemingly desirable trait, often masks a brittle architecture unable to adapt. NormCode, by embracing transparency, acknowledges that systems don’t fail so much as evolve, and that understanding this evolution requires a language precise enough to chart its course.

The Garden Grows

NormCode attempts to draw lines around the unruly growth of LLM orchestration – to define boundaries where none naturally exist. It is a necessary, if ultimately temporary, measure. The illusion of complete isolation is quickly shattered by the simple fact that data, like water, will always find the lowest point. The true challenge isn’t preventing leakage, but designing for forgiveness – systems that degrade gracefully when boundaries blur, rather than collapsing under the weight of unexpected inputs.

Future work will inevitably confront the tension between explicitness and expressiveness. A language that demands absolute transparency risks becoming too brittle to adapt to the ever-shifting landscape of AI models. The goal shouldn’t be a perfect accounting of every data transformation, but a system capable of revealing, with reasonable fidelity, the intent behind the computation. This requires a shift from tracking data lineage to understanding behavioral patterns.

One suspects that the pursuit of “auditable AI” is less about achieving perfect control and more about building trust in a fundamentally opaque process. A system isn’t a machine to be disassembled and understood; it’s a garden. One doesn’t audit a garden, one observes its growth, learns from its failures, and tends to it with humility. The language, then, is merely a tool for the gardener – a means, not an end.


Original article: https://arxiv.org/pdf/2512.10563.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
