From Instructions to Intelligent Programs

Author: Denis Avetisyan


A new framework streamlines the creation of neuro-symbolic programs by translating natural language into executable code.

The system architecture is organized into two phases, Knowledge Declaration and Model Declaration, in which automated language model agents process inputs and generate outputs while humans step in at designated decision points to steer iterative refinement.

This paper introduces AgenticDomiKnowS, an agentic workflow leveraging large language models to automate neuro-symbolic program synthesis.

Despite the promise of enhanced robustness and data efficiency, integrating symbolic constraints into deep learning remains a significant challenge for practitioners. The paper ‘An Agentic Framework for Neuro-Symbolic Programming’ presents AgenticDomiKnowS (ADS), a system that automatically generates neuro-symbolic programs from natural language instructions. By employing an agentic workflow, ADS removes the need for specialized programming expertise in frameworks like DomiKnowS, reducing development time from hours to minutes. Could this approach unlock broader adoption of neuro-symbolic AI by democratizing program creation and enabling rapid prototyping of complex knowledge-infused systems?


The Fragility of Pattern: Beyond Simple Scaling

Despite remarkable advancements, deep learning models frequently encounter challenges when tasked with complex reasoning or adhering to strict constraints. These systems are proficient at identifying patterns within data, excelling at tasks like image classification and speech recognition, but often falter when a task requires logical deduction, planning, or the application of rules. This limitation stems from their reliance on statistical correlations rather than explicit symbolic representation of knowledge. Consequently, even minor variations in input or the introduction of novel scenarios can lead to unpredictable outcomes, highlighting a fundamental gap between pattern recognition and genuine cognitive abilities. The inability to effectively manage constraints, such as physical laws or logical dependencies, restricts their application in domains demanding reliability and interpretability, such as robotics, medical diagnosis, and financial modeling.

The prevailing reliance on deep learning models often necessitates extraordinarily large datasets and substantial computational power. This demand isn’t merely a matter of scale; it fundamentally impacts a system’s adaptability and efficiency. Training these models, even with optimized algorithms, consumes significant energy and time, creating a barrier to deployment in resource-constrained environments or for rapidly changing tasks. Furthermore, the need for constant retraining with new data to maintain accuracy presents logistical challenges and ongoing costs. Consequently, the limitations imposed by these resource requirements are increasingly recognized as critical bottlenecks, prompting exploration into alternative AI paradigms that prioritize data efficiency and computational frugality.

Recognizing the inherent constraints of purely data-driven deep learning, researchers are increasingly focused on neuro-symbolic programming as a promising alternative. This emerging field seeks to combine the pattern recognition capabilities of neural networks with the explicit reasoning and knowledge representation of symbolic AI. By integrating these traditionally separate paradigms, systems can leverage the strengths of both – the ability to learn from raw data and the capacity to generalize, explain decisions, and operate effectively with limited data. This fusion allows for more robust, adaptable, and interpretable AI, potentially overcoming the scalability and reasoning bottlenecks that currently plague many deep learning applications and paving the way for more human-like cognitive abilities in artificial intelligence.

A DomiKnowS program for the WIQA task integrates conceptual graphs and logical constraints with sensor code that links properties and predictive models to the graph's concepts.

DomiKnowS: Constructing a Foundation of Meaning

DomiKnowS utilizes Conceptual Graphs (CGs) as its primary knowledge representation format. CGs are a formal system for representing meaning based on a graph structure consisting of concepts as nodes and relations as edges. This declarative approach encodes domain knowledge and logical constraints by explicitly defining entities, attributes, and the relationships between them. CGs support both semantic representation and logical inference: constraints are attached directly to the graph structure, enabling the system to reason through graph traversal and pattern matching. The graph-based representation also allows efficient storage and retrieval of knowledge and facilitates knowledge sharing and reuse across applications and domains.
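To make the "concepts as nodes, relations as edges" idea concrete, here is a minimal plain-Python sketch. It does not use the DomiKnowS API; the `ConceptGraph` class, its methods, and the concept and relation names (loosely modeled on a WIQA-style question with three labels) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptGraph:
    """Toy concept graph (hypothetical, not the DomiKnowS API)."""

    concepts: set = field(default_factory=set)
    # Each edge is a (source concept, relation name, target concept) triple.
    relations: set = field(default_factory=set)

    def add_relation(self, source: str, name: str, target: str) -> None:
        """Add a labelled edge, registering both endpoints as concepts."""
        self.concepts.update({source, target})
        self.relations.add((source, name, target))

    def neighbors(self, source: str, name: str) -> set:
        """Concepts reachable from `source` over one edge labelled `name`."""
        return {t for s, r, t in self.relations if s == source and r == name}


# Illustrative names only: a paragraph contains questions, and each
# question can carry one of three candidate labels.
graph = ConceptGraph()
graph.add_relation("paragraph", "contains", "question")
graph.add_relation("question", "has_label", "is_more")
graph.add_relation("question", "has_label", "is_less")
graph.add_relation("question", "has_label", "no_effect")

print(sorted(graph.neighbors("question", "has_label")))
```

A single-hop lookup like `neighbors` is the simplest form of the graph traversal mentioned above; constraints would then be stated over the concepts such lookups return.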

The use of a declarative knowledge representation in DomiKnowS enables explicit reasoning processes and constraint satisfaction mechanisms. This approach differs from purely data-driven methods by allowing the system to articulate why a particular conclusion was reached, thereby improving interpretability. By explicitly defining relationships and limitations within the knowledge graph, the system can verify the validity of its inferences against these constraints, leading to increased accuracy and reduced instances of logically inconsistent outputs. Constraint satisfaction ensures that solutions adhere to predefined rules, minimizing errors and enhancing the reliability of the system’s responses.

Neuro-Symbolic Programming, as implemented within DomiKnowS, integrates the strengths of both deep learning and symbolic reasoning approaches. Deep learning components facilitate pattern recognition and learning from data, enabling the system to generalize and handle noisy or incomplete information. Simultaneously, symbolic reasoning provides a mechanism for representing explicit knowledge, performing logical inference, and ensuring verifiable correctness. This combination allows DomiKnowS to move beyond purely data-driven approaches, offering improved accuracy, robustness, and interpretability by grounding learned representations in explicit knowledge and logical constraints. The framework benefits from the ability to learn complex relationships from data while maintaining the ability to reason about those relationships in a transparent and verifiable manner.

AgenticDomiKnowS provides an interactive environment for program construction directly from natural language input. This framework enables users to articulate desired functionalities in plain language, which are then translated into executable programs within DomiKnowS. The system employs an agentic approach, decomposing complex tasks into smaller, manageable steps handled by individual agents. These agents collaborate to achieve the overall objective, facilitating a modular and interpretable programming process. The interactive nature of AgenticDomiKnowS allows for real-time feedback and refinement of the generated programs, ensuring alignment with user intent and enabling iterative development.

The human reviewer interface displays generated graph drafts alongside agent verdicts and a workflow history, allowing users to approve code or provide natural language feedback for refinement.

Automated Knowledge Engineering with LLM Agents

AgenticDomiKnowS automates Knowledge Declaration through the coordinated operation of three LLM Agents. The Graph Design Agent is responsible for initial knowledge graph construction, defining concepts and relationships. Following design, the Graph Execution Agent populates the graph with data by querying external sources and assigning relevant information to defined concepts. Finally, the Graph Reviewer Agent validates the constructed graph, identifying and correcting inconsistencies or inaccuracies to ensure data quality and logical coherence before knowledge is declared.
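The design, execute, review cycle described above can be sketched as a simple control loop. This is only an illustration under stated assumptions: the function `run_knowledge_declaration`, the draft format, and the three stub agents are hypothetical stand-ins for LLM calls, and the real system's prompts, data structures, and stopping rules are not shown in this article.

```python
from typing import Callable, Optional

# Hypothetical draft format: {"concepts": [...], "data": {...}}
Draft = dict


def run_knowledge_declaration(
    task: str,
    design: Callable[[str, list], Draft],
    execute: Callable[[Draft], Draft],
    review: Callable[[Draft], list],
    max_rounds: int = 3,
) -> Optional[Draft]:
    """Loop design -> execute -> review until the reviewer raises no issues."""
    issues: list = []
    for _ in range(max_rounds):
        draft = execute(design(task, issues))
        issues = review(draft)
        if not issues:
            return draft  # ready to hand to a human for approval
    return None  # round budget exhausted; escalate to a human


# Stub agents standing in for LLM calls (assumptions, not the real agents).
def design(task: str, issues: list) -> Draft:
    concepts = ["email", "spam"]
    if "missing concept: ham" in issues:
        concepts.append("ham")  # revise the draft using reviewer feedback
    return {"concepts": concepts, "data": {}}


def execute(draft: Draft) -> Draft:
    # Attach an (empty) data slot to every declared concept.
    return {**draft, "data": {c: [] for c in draft["concepts"]}}


def review(draft: Draft) -> list:
    return [] if "ham" in draft["concepts"] else ["missing concept: ham"]


result = run_knowledge_declaration("classify spam", design, execute, review)
print(result)
```

Passing the reviewer's issues back into the next design call is what makes the loop iterative; capping the number of rounds keeps a stuck draft from cycling indefinitely before a human sees it.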

The Model Declaration stage within DomiKnowS establishes connections between graph concepts and external data sources and learning mechanisms. Specifically, Sensors are assigned to graph concepts to facilitate data ingestion from various sources, while LLMModels, leveraging the capabilities of GPT-5, are linked to enable reasoning, inference, and learning directly on those concepts. This assignment process allows DomiKnowS to dynamically integrate real-world data with the knowledge graph and utilize the LLM to enhance understanding and derive new insights from the interconnected concepts, effectively creating a self-learning knowledge system.

DomiKnowS leverages Integer Linear Programming (ILP) to address constraint satisfaction problems inherent in knowledge graph construction and maintenance. ILP allows for the formal representation of relationships between graph concepts as a set of linear constraints, where variables represent the existence or absence of specific relationships or attributes. The system then employs established ILP solvers to find optimal solutions that satisfy these constraints, effectively resolving conflicts and ensuring data consistency within the knowledge graph. This approach is particularly useful for complex scenarios involving numerous interconnected concepts and conflicting information, enabling DomiKnowS to automatically derive logical conclusions and maintain a coherent knowledge representation.
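As a rough illustration of constrained inference posed as a 0/1 linear program, the sketch below maximizes a linear score over binary variables subject to linear inequality constraints. It deliberately uses brute-force enumeration instead of an ILP solver, so it only works for a handful of variables; the function name, the scores, and the two-label spam example are assumptions made for illustration, not details taken from DomiKnowS.

```python
from itertools import product


def solve_binary_program(scores, constraints):
    """Maximize sum(scores[i] * x[i]) over x in {0, 1}^n.

    Each constraint is (coefficients, bound) meaning sum(c * x) <= bound.
    Brute-force enumeration, not a real ILP solver.
    """
    best, best_value = None, float("-inf")
    for x in product((0, 1), repeat=len(scores)):
        feasible = all(
            sum(c * v for c, v in zip(coeffs, x)) <= bound
            for coeffs, bound in constraints
        )
        if feasible:
            value = sum(s * v for s, v in zip(scores, x))
            if value > best_value:
                best, best_value = x, value
    return best, best_value


# Variables: [is_spam, is_regular]. Both model scores are high, so an
# unconstrained decision would switch both labels on.
scores = [0.7, 0.6]
constraints = [
    ([1, 1], 1),     # is_spam + is_regular <= 1
    ([-1, -1], -1),  # is_spam + is_regular >= 1, written as a <= constraint
]
assignment, value = solve_binary_program(scores, constraints)
print(assignment, value)
```

Together the two inequalities encode "exactly one label", so the search keeps only the higher-scoring label. A production system would hand the same objective and constraints to a dedicated solver.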

The DomiKnowS system leverages FastAPI, a modern, high-performance web framework for building APIs, to handle backend logic and data processing related to knowledge engineering tasks. The frontend user interface is constructed with Next.js, a React framework that provides features such as server-side rendering and static site generation, improving performance and user experience. This combination facilitates interactive knowledge declaration, model configuration, and graph visualization through a web browser, allowing users to define sensors, assign LLM models, and review the resulting knowledge graphs without requiring specialized software installations or command-line interfaces. FastAPI’s asynchronous capabilities further enhance responsiveness, while Next.js optimizes the delivery of the user interface elements.

Validation and Broad Applicability

DomiKnowS demonstrates a notable capacity for complex reasoning, as evidenced by its successful application to the WIQA Task – a challenge designed to assess understanding of relational reasoning. The framework achieves this through the strategic implementation of the Transitivity Relation, a fundamental principle of logic. By recognizing that if A relates to B, and B relates to C, then A must also relate to C, DomiKnowS ensures internal consistency within its knowledge representation. This approach allows the system to draw inferences and make connections beyond explicitly stated facts, ultimately improving its performance on tasks demanding nuanced understanding and logical deduction. The framework’s ability to leverage transitive relationships offers a powerful mechanism for navigating complex datasets and constructing coherent, reliable knowledge graphs.
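The transitivity rule stated above (if A relates to B and B relates to C, then A relates to C) can be checked mechanically. The following sketch is a generic illustration over a set of ordered pairs, not the DomiKnowS constraint syntax; the function names and the example relation are hypothetical.

```python
from itertools import permutations


def transitivity_violations(relation):
    """Return (a, b, c) triples where a->b and b->c hold but a->c is missing."""
    nodes = {n for pair in relation for n in pair}
    return [
        (a, b, c)
        for a, b, c in permutations(sorted(nodes), 3)
        if (a, b) in relation and (b, c) in relation and (a, c) not in relation
    ]


def transitive_closure(relation):
    """Repeatedly add implied pairs until no new pair appears."""
    closure = set(relation)
    while True:
        new_pairs = {
            (a, d) for a, b in closure for c, d in closure if b == c
        } - closure
        if not new_pairs:
            return closure
        closure |= new_pairs


predicted = {("A", "B"), ("B", "C")}
print(transitivity_violations(predicted))
print(sorted(transitive_closure(predicted)))
```

The first function flags predictions that break the rule, while the second repairs them by adding every implied pair. Whether to penalize or repair a violation is a design choice the sketch leaves open.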

DomiKnowS exhibits strong categorization abilities when applied to practical challenges like spam detection and hierarchical image classification, achieving accuracy through the implementation of logical constraints. This approach moves beyond simple pattern recognition by explicitly defining relationships between categories – for example, establishing that an email containing certain keywords and originating from a specific domain is definitively classified as spam. Similarly, in image classification, the framework doesn’t merely identify objects; it understands their relationships within a hierarchy – recognizing a ‘Siamese cat’ as a specific type of ‘cat’, which itself falls under the broader category of ‘mammal’. By encoding these logical dependencies, DomiKnowS minimizes misclassifications and ensures a more consistent and reliable categorization process, proving effective across diverse datasets and tasks.
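The hierarchy example above (a Siamese cat is a cat, and a cat is a mammal) amounts to an implication constraint: predicting a label should imply predicting each of its ancestors. A minimal sketch of that check follows, assuming a hypothetical `PARENT` mapping and helper functions that are not part of DomiKnowS.

```python
# Hypothetical child -> parent mapping for the example hierarchy.
PARENT = {"siamese_cat": "cat", "cat": "mammal"}


def ancestors(label, parent):
    """Walk up the hierarchy and list every ancestor of `label`."""
    chain = []
    while label in parent:
        label = parent[label]
        chain.append(label)
    return chain


def hierarchy_violations(predicted, parent):
    """Return (label, ancestor) pairs where the label is predicted but the ancestor is not."""
    return [
        (label, ancestor)
        for label in sorted(predicted)
        for ancestor in ancestors(label, parent)
        if ancestor not in predicted
    ]


inconsistent = {"siamese_cat", "mammal"}  # skips the intermediate "cat"
consistent = {"siamese_cat", "cat", "mammal"}
print(hierarchy_violations(inconsistent, PARENT))
print(hierarchy_violations(consistent, PARENT))
```

A prediction set that names a fine-grained class without its parent is flagged, which is the kind of inconsistency a logical constraint is meant to rule out.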

DomiKnowS distinguishes itself through a fundamentally modular architecture coupled with automated knowledge engineering tools, enabling remarkably swift prototyping and adaptation to previously unseen domains. This design bypasses the traditional bottleneck of manual knowledge base construction; instead, the system dynamically builds and refines its understanding through automated processes. Consequently, researchers and developers can rapidly deploy DomiKnowS to tackle novel challenges without extensive pre-configuration or specialized expertise. The framework’s components – encompassing knowledge acquisition, reasoning, and validation modules – function as independent units, facilitating easy customization and extension. This adaptability isn’t limited to software modifications; the automated features actively learn from new data, allowing DomiKnowS to evolve its capabilities and maintain relevance across a diverse spectrum of applications and datasets.

DomiKnowS incorporates mechanisms for continuous human oversight, ensuring the system’s outputs consistently reflect desired user goals. These ‘Human-in-the-Loop’ processes aren’t simply about error correction; they actively guide the framework’s learning and adaptation. By allowing users to validate, refine, and provide feedback on the system’s reasoning and knowledge construction, DomiKnowS avoids the pitfalls of rigid automation. This iterative approach enables the framework to subtly calibrate its internal logic, becoming increasingly attuned to nuanced user preferences and evolving task requirements. Consequently, DomiKnowS isn’t merely a tool that performs tasks, but a system that learns to perform them better, mirroring a collaborative intelligence between human expertise and artificial reasoning.

AgenticDomiKnowS demonstrates a remarkable capacity for automated knowledge construction, achieving a high degree of accuracy in its knowledge declarations. When paired with the Kimi-K2 language model, the system attains an impressive 97.22% accuracy, indicating a robust ability to formulate and assert factual information. Even with the GPT-5 (low) model, a still-significant 86.11% accuracy is observed, highlighting the framework’s adaptability and resilience across varying language model capabilities. This performance suggests that AgenticDomiKnowS can reliably build a knowledge base with minimal human intervention, paving the way for more autonomous and intelligent systems capable of complex reasoning and problem-solving.

The adaptability of DomiKnowS is demonstrated by its successful application to markedly different datasets, including the nuanced sentiment analysis required by Amazon Reviews, the complex citation network of the WOS Dataset, and the intricate linguistic structures within the CoNLL Dataset. This suggests the framework isn’t limited to a single data type or task; it handles both textual and network-based information, along with the challenges of natural language processing. The consistent completion of tasks on these diverse datasets highlights a core strength: DomiKnowS’s capacity to express the underlying domain knowledge and apply it regardless of the specific domain or data presentation.

The efficiency of DomiKnowS is demonstrably high, as experienced users consistently complete complex tasks within a timeframe of just 10 to 15 minutes. This rapid task completion isn’t simply about speed; it reflects the framework’s streamlined design and automated knowledge engineering capabilities. By minimizing the need for manual intervention and facilitating quick prototyping, DomiKnowS allows users to swiftly address challenges across diverse datasets – including Amazon Reviews, WOS, and CoNLL – and achieve impactful results with minimal time investment. This level of efficiency positions DomiKnowS as a practical solution for applications demanding both accuracy and responsiveness.

The pursuit of automated program synthesis, as detailed in this work with AgenticDomiKnowS, inherently acknowledges the inevitable entropy of complex systems. Just as a chronicle meticulously records a system’s evolution, AgenticDomiKnowS aims to capture and translate intent into functional code, mitigating the decay that arises from manual, error-prone program authoring. Barbara Liskov observed, “Programs must be correct, but they also must be understandable.” This sentiment resonates deeply with the framework’s focus on knowledge declaration and neuro-symbolic representation, striving not only for functional correctness but also for a clarity that resists the obscuring effects of time and complexity. The framework doesn’t simply build programs; it constructs a lineage, a traceable history of intent made manifest.

What Lies Ahead?

The pursuit of automated program synthesis, as exemplified by frameworks like AgenticDomiKnowS, is less about achieving a final product and more about elegantly postponing inevitable entropy. Every line of code, even that generated by an agent, is a temporary reprieve from the second law. The current iteration offers a compelling scaffolding for neuro-symbolic reasoning, but it acknowledges, implicitly, that knowledge declaration remains a brittle act. Future work will likely focus not on expanding the scope of instruction, but on building systems that gracefully degrade when faced with ambiguity – systems that remember their limitations.

Versioning, in this context, is a form of memory; each refinement of the agentic workflow a palimpsest layered over prior assumptions. The arrow of time always points toward refactoring. A critical next step involves moving beyond purely declarative knowledge and embracing procedural understanding – enabling the agent to not merely know what is true, but how to discover it. This will require a deeper integration of exploration and exploitation within the agentic architecture.

Ultimately, the longevity of such systems will depend not on their ability to generate perfect programs, but on their capacity to adapt – to learn from their mistakes and to evolve alongside the ever-shifting landscape of knowledge. The true measure of success will be the elegance with which these systems age, not the illusion of immortality they attempt to create.


Original article: https://arxiv.org/pdf/2601.00743.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/

2026-01-05 16:10