Decoding Disease: AI Systems That Explain Their Diagnoses

Author: Denis Avetisyan


Researchers have demonstrated a new approach to medical diagnosis that combines the power of artificial intelligence with the ability to clearly articulate its reasoning.

The system transforms medical literature into an Answer Set Programming (ASP) program via a large language model, subsequently leveraging an ASP solver to arrive at a diagnosis tailored to individual patient data, a process that acknowledges the inevitable translation of theoretical knowledge into practical, and potentially imperfect, clinical application.

A novel system, McCoy, leverages Large Language Models and Answer Set Programming to construct a knowledge base for accurate and explainable disease diagnosis.

Despite advances in artificial intelligence for healthcare, constructing robust and interpretable diagnostic systems remains challenging due to the intensive effort required for knowledge base creation. This paper introduces a novel framework, ‘A Proof-of-Concept for Explainable Disease Diagnosis Using Large Language Models and Answer Set Programming’, which bridges this gap by automatically translating medical literature into a logical knowledge base using large language models and reasoning with answer set programming. This integration yields a system capable of both accurate and explainable disease diagnosis, demonstrated through preliminary results on small-scale tasks. Could this approach pave the way for more transparent and reliable AI-driven healthcare solutions?


The Illusion of Diagnostic Mastery

Effective disease diagnosis is fundamentally reliant on a physician’s ability to synthesize information from a vast and continually expanding body of medical literature alongside the unique details of each patient’s presentation. This process isn’t merely about recalling facts; it demands a nuanced interpretation of research findings, clinical guidelines, and patient histories – often expressed in complex, ambiguous natural language. The diagnostic process involves identifying patterns, weighing probabilities, and applying logical reasoning to determine the most likely cause of illness, a cognitive feat that requires connecting disparate pieces of information. Consequently, the increasing volume and complexity of both medical knowledge and patient data present a significant challenge, underscoring the need for tools and methods that can effectively augment human diagnostic capabilities and ensure accurate, timely interventions.

Historically, converting the richness of clinical data – encompassing patient histories, physical exam findings, and lab results – into a form computers can process has proven remarkably difficult. Traditional diagnostic approaches rely heavily on physician expertise to synthesize this nuanced information, a process that isn’t easily replicated by algorithms. Existing methods often force complex medical details into rigid, pre-defined categories or rely on keyword matching, leading to a significant loss of context and potentially crucial insights. This simplification hinders the development of truly intelligent diagnostic systems, as the subtle relationships between symptoms, risk factors, and disease progression – essential for accurate assessment – are often lost in translation. Consequently, the gap between the complexity of real-world clinical data and the demands of automated reasoning remains a persistent challenge in the field of medical artificial intelligence.

Diagnostic systems striving to interpret medical texts and patient records face a core hurdle: the pervasive ambiguity of natural language. Medical language is replete with synonyms, contextual nuances, and implicit assumptions, requiring systems to move beyond simple keyword matching. A patient’s description of “discomfort” could indicate anything from mild indigestion to a life-threatening condition, and accurately discerning the intended meaning demands robust inference capabilities. These systems must not only parse the literal words but also contextualize them within the patient’s history, symptoms, and the broader medical literature, effectively reasoning about probabilities and possibilities to arrive at a defensible diagnosis. Without this capacity for nuanced understanding and logical deduction, diagnostic tools risk misinterpreting crucial information and generating inaccurate or incomplete assessments, highlighting the need for advanced natural language processing and knowledge representation techniques.

McCoy: Trading Ambiguity for Rules

The McCoy framework employs Large Language Models (LLMs) to automate the conversion of natural language medical text into a formal, rule-based format compatible with Answer Set Programming (ASP). This translation process involves identifying key entities, relationships, and clinical guidelines within the literature and representing them as logical rules. Specifically, the LLM is prompted to extract facts and constraints from medical publications, which are then encoded into ASP rules using a predefined schema. This automated rule generation eliminates the need for manual knowledge engineering, allowing McCoy to rapidly build and update its knowledge base from a growing corpus of medical literature and facilitating reasoning via ASP solvers.
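As a minimal sketch of what this translation step looks like in outline (the function name `translate_to_asp`, the predicate schema, and the hard-coded stub are illustrative assumptions, not McCoy's actual implementation, which delegates this step to an LLM):

```python
# Illustrative sketch of the literature-to-ASP translation step.
# The "LLM" here is a hard-coded stub; in the real system a large
# language model performs this mapping from text to rules.

def translate_to_asp(sentence: str) -> list[str]:
    """Stand-in for the LLM translation step (hypothetical helper)."""
    # A real prompt would include few-shot examples and a predicate schema.
    stub_knowledge = {
        "Fever and cough suggest influenza.":
            ["suspect(influenza) :- symptom(fever), symptom(cough)."],
    }
    return stub_knowledge.get(sentence, [])

rules = translate_to_asp("Fever and cough suggest influenza.")
print(rules[0])  # suspect(influenza) :- symptom(fever), symptom(cough).
```

The key design point is that the LLM's output is not free text but syntactically constrained ASP, so every generated rule can be checked by the solver before it enters the knowledge base.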

The conversion of unstructured medical text into a formal knowledge base within the McCoy framework involves extracting entities, relationships, and rules from natural language sources. This transformation utilizes Large Language Models to identify and categorize medical concepts, then represents them in a logic-based format compatible with Answer Set Programming (ASP). The resulting knowledge base consists of facts and rules that define the characteristics of diseases, symptoms, and their interconnections. This formal representation allows for deductive reasoning; given a patient’s symptoms (input facts), the ASP solver can apply the encoded medical rules to infer potential diagnoses and provide verifiable explanations for its conclusions, facilitating a transparent and auditable diagnostic process.

The McCoy framework improves upon traditional diagnostic systems by integrating the natural language processing capabilities of Large Language Models with the formal reasoning of Answer Set Programming. Conventional diagnostic tools often struggle with the ambiguity and variability of medical text, leading to inaccuracies or incomplete analyses. McCoy mitigates these issues by automatically converting unstructured medical literature into a structured, rule-based knowledge base that ASP can then utilize for precise inference. This combined approach has demonstrated an accuracy rate of 95-100% in diagnosing a selection of targeted diseases, representing a substantial improvement over systems relying solely on statistical or manually knowledge-engineered methods.

Knowledge Base Construction: A Necessary Formalization

Prompt engineering is critical for effectively utilizing Large Language Models (LLMs) in medical knowledge base construction. Specifically, carefully crafted prompts direct the LLM to identify and extract relevant information – such as symptom-disease associations, treatment efficacy data, and diagnostic criteria – from unstructured medical literature. These prompts utilize techniques like few-shot learning, providing the LLM with examples of desired extractions, and constraint specification, defining the format and scope of the output. The quality of the extracted data is directly proportional to the precision and clarity of the prompts; ambiguous or poorly designed prompts can lead to inaccurate or incomplete knowledge representation, necessitating iterative refinement of the prompting strategy to achieve high fidelity and reliability in the formalized medical knowledge.
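A few-shot extraction prompt of the kind described might be assembled as follows (the wording, the `build_prompt` helper, and the `symptom/1` / `suspect/1` predicate schema are illustrative assumptions, not the paper's actual prompts):

```python
# Illustrative few-shot prompt for extracting ASP rules from medical text.
# Each example pairs a passage with its desired ASP encoding, giving the
# LLM both the target format and the allowed predicates.

EXAMPLES = [
    ("Chest pain radiating to the arm can indicate myocardial infarction.",
     "suspect(myocardial_infarction) :- symptom(chest_pain), symptom(arm_pain)."),
]

def build_prompt(passage: str) -> str:
    """Assemble a few-shot prompt constraining output to a fixed schema."""
    shots = "\n".join(f"Text: {t}\nASP: {r}" for t, r in EXAMPLES)
    return (
        "Translate each medical statement into an ASP rule using only the "
        "predicates symptom/1 and suspect/1.\n"
        f"{shots}\nText: {passage}\nASP:"
    )

print(build_prompt("Fever and stiff neck can indicate meningitis."))
```

Constraining the output vocabulary in the prompt is what makes the extracted rules composable: every rule produced this way shares one predicate schema, so the solver can chain them without reconciliation.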

Knowledge Base construction within the McCoy framework utilizes Answer Set Programming (ASP) rules to represent medical knowledge. These rules define logical relationships between clinical entities: symptoms are linked to potential diseases, diseases are associated with specific treatments, and symptom co-occurrence can indicate particular conditions. Each ASP rule consists of a head and a body; the head defines a belief or conclusion, while the body specifies the conditions that must be met for that belief to be true. For example, a rule might state: “If patient exhibits symptom A AND symptom B, THEN suspect disease X.” The knowledge base is built by converting extracted medical literature into a comprehensive set of these ASP rules, creating a formal and computable representation of medical expertise. This structured format allows the McCoy system to perform logical reasoning and generate diagnostic hypotheses.
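The "if symptom A and symptom B, then suspect disease X" pattern above can be made concrete with a toy evaluator. This is a deliberately simplified sketch: a real deployment would hand such rules to an ASP solver (e.g. clingo), which also handles negation, defaults, and multiple answer sets that this conjunctive-only version ignores.

```python
# Toy rule-based diagnosis. Each rule is (head, body): the head holds
# whenever every literal in the body is among the patient's facts.
# A real ASP solver would compute full answer sets; this only covers
# the simple conjunctive case described in the text.

RULES = [
    ("suspect(influenza)", {"symptom(fever)", "symptom(cough)"}),
    ("suspect(meningitis)", {"symptom(fever)", "symptom(stiff_neck)"}),
]

def diagnose(facts: set[str]) -> set[str]:
    """Return every rule head whose body is satisfied by the facts."""
    return {head for head, body in RULES if body <= facts}

patient = {"symptom(fever)", "symptom(cough)"}
print(diagnose(patient))  # {'suspect(influenza)'}
```

Because each conclusion is tied to an explicit rule and the facts that fired it, the derivation itself doubles as the explanation, which is the property the framework relies on.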

The constructed Knowledge Base functions as the core reasoning engine within the McCoy framework. It utilizes Answer Set Programming (ASP) rules to represent medical knowledge, enabling the system to infer potential diagnoses based on observed symptoms and patient data. These rules define logical relationships – for example, linking specific symptoms to diseases, or associating diseases with appropriate treatments. When presented with a patient’s information, McCoy employs an ASP solver to evaluate the rules within the Knowledge Base, identifying sets of diagnoses that are consistent with the evidence. The resulting answer sets are then ranked and presented as potential diagnoses, forming the basis for clinical decision support. The completeness and accuracy of this Knowledge Base directly impact the reliability and effectiveness of the McCoy diagnostic process.

Explainability: The Illusion of Understanding

The McCoy framework distinguishes itself through a fundamental commitment to Explainable AI, ensuring clinicians aren’t simply presented with a diagnosis, but also gain access to the underlying rationale. This isn’t merely about displaying data; the system is designed to articulate the precise chain of reasoning that led to a particular diagnostic suggestion. By prioritizing transparency, McCoy aims to move beyond the ‘black box’ limitations often associated with artificial intelligence in healthcare. It allows medical professionals to evaluate the system’s logic, assess the validity of the contributing factors, and ultimately integrate the AI’s insights with their own clinical judgment, fostering a collaborative approach to diagnosis and strengthening confidence in the technology.

The diagnostic process within this framework isn’t a ‘black box’; instead, tools like Xclingo facilitate a detailed visualization of the system’s reasoning. This means clinicians can trace the precise steps the AI took to arrive at a particular diagnosis, observing the specific rules activated and the patient data – symptoms, test results, and medical history – that triggered them. This isn’t simply presenting a final answer, but rather revealing how that answer was derived, effectively mapping the chain of inference. By highlighting the relevant evidence and the logical connections between data points and conclusions, the system fosters a deeper understanding of its rationale, moving beyond prediction to genuine explanation.

The ability of an AI diagnostic system to articulate its reasoning is not merely a technical feature, but a cornerstone of its practical utility and acceptance within clinical settings. When a system transparently reveals the evidence and logical steps that led to a particular diagnosis, it fosters a crucial sense of trust with the clinician. This transparency moves beyond a simple output of results, allowing medical professionals to critically evaluate the AI’s suggestions, integrate them with their own expertise, and ultimately make more informed decisions regarding patient care. Consequently, clinicians are empowered to identify potential errors or biases in the system’s reasoning, ensuring that AI serves as a supportive tool rather than an unquestioned authority, and leading to improved accuracy and patient outcomes.

The pursuit of automated diagnosis, as demonstrated by McCoy’s integration of Large Language Models and Answer Set Programming, often feels like chasing a mirage. The system meticulously constructs a knowledge base, striving for explainability – a noble goal. Yet, experience suggests that any attempt to perfectly model medical complexity will inevitably fall short. As Linus Torvalds aptly stated, “Most programmers think that if their code compiles, it automatically works.” Similarly, a system achieving diagnostic accuracy based on a constructed knowledge base shouldn’t inspire undue confidence. The elegance of the approach – linking LLM inferences to ASP reasoning – will likely be obscured by the sheer messiness of real-world medical data and unforeseen edge cases. The core concept of explainability is admirable, but the path to achieving it is paved with pragmatic compromises.

The Road Ahead

The construction of McCoy, a system integrating Large Language Models with Answer Set Programming for diagnostic reasoning, predictably introduces a new set of elegantly solvable problems. The immediate utility lies not in a revolution in medical diagnosis, but in the automation of knowledge base curation – a task previously reliant on brittle ontologies and human annotation. The system’s performance will, inevitably, plateau as the limits of LLM-derived knowledge become apparent. The real challenge isn’t achieving higher accuracy scores; it’s managing the inevitable degradation of those scores as medical literature evolves, and the LLM’s internal representations drift.

Future work will likely focus on mitigating the ‘hallucination’ problem inherent in LLMs, framing it as a knowledge integrity issue rather than a purely linguistic one. Expect to see increasingly complex methods for grounding LLM outputs in verifiable data, though a truly robust solution remains elusive. The current architecture, while providing a degree of explainability through Answer Set Programming, merely shifts the opacity – the underlying reasoning within the LLM itself remains a black box. The pursuit of ‘explainable AI’ often feels less like illumination and more like a sophisticated game of smoke and mirrors.

Ultimately, the field requires a recalibration of ambition. It doesn’t need more elaborate architectures, more powerful models, or more granular knowledge representations. It needs fewer illusions. The enduring problems in medical diagnosis aren’t technical; they are epistemic. No system, however cleverly constructed, can overcome the fundamental limitations of incomplete and uncertain information.


Original article: https://arxiv.org/pdf/2512.23932.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
