AI Agents Unlock New Paths to Drug Repurposing

Author: Denis Avetisyan


A new conversational AI system empowers researchers to identify potential drug candidates for various diseases without requiring specialized bioinformatics expertise.

ChatDRex leverages multi-agent technology and knowledge graphs for accessible, network-based disease module identification and drug repurposing prediction.

Despite the promise of accelerated drug discovery through repurposing, realizing this potential requires bridging the gap between complex bioinformatics analyses and accessible tools for clinical experts. This challenge is addressed in ‘Conversational no-code and multi-agentic disease module identification and drug repurposing prediction with ChatDRex’, which presents a novel conversational AI system that democratizes access to network-based drug repurposing. By integrating a knowledge graph with a multi-agent architecture, ChatDRex enables users without specialized coding skills to explore potential drug candidates through natural language interactions. Will this approach catalyze a new era of translational research and personalized medicine by empowering clinicians to directly leverage the power of bioinformatics?


Navigating the Complexities of Therapeutic Discovery

The pursuit of novel therapeutics has historically been a protracted and financially demanding undertaking, frequently yielding limited success in addressing the inherent intricacies of disease. Conventional drug discovery typically focuses on single molecular targets, neglecting the reality that illnesses rarely stem from isolated malfunctions but rather from disruptions within complex biological systems. This reductionist approach often results in therapies that address symptoms without resolving the underlying causes, or that inadvertently trigger unintended consequences due to off-target effects. Commonly cited estimates put the average cost of bringing a single drug to market above $2.6 billion, a figure largely attributable to the high failure rate in clinical trials, which in turn reflects the difficulty of accurately predicting how a drug will interact with the multifaceted web of proteins, genes, and pathways that govern physiological processes. Consequently, a paradigm shift is needed to move beyond single-target approaches and embrace strategies that account for the systemic nature of disease.

The human body isn’t a collection of isolated parts, but a vast, interwoven network where genes, proteins, and metabolites constantly interact. Network biology recognizes this fundamental truth, positing that disease isn’t simply a malfunction of a single gene or protein, but a disruption within these complex relationships. Consequently, traditional drug discovery methods, focused on single targets, often fall short. Addressing this requires computational approaches capable of mapping and reasoning over these intricate biological networks – techniques that can identify key nodes and pathways driving disease. By modeling these connections, researchers can move beyond single-target approaches and explore strategies that modulate entire systems, potentially leading to more effective and robust therapies. This systems-level understanding is crucial for deciphering the underlying causes of complex diseases and designing interventions that restore network homeostasis.

The pursuit of effective therapies is often stalled by a critical bottleneck: the inability of current computational methods to fully leverage the wealth of biological data available. While genomic, proteomic, and clinical datasets offer unprecedented insights into disease, these sources remain largely siloed and difficult to synthesize. Existing algorithms struggle to navigate the intricate web of interactions within biological networks – the complex relationships between genes, proteins, and metabolites – which are crucial for understanding disease mechanisms. This limitation hinders the identification of potential drug repurposing candidates, as it prevents researchers from accurately predicting how existing drugs might impact these interconnected systems. Consequently, promising therapeutic opportunities are often overlooked, extending the time and cost associated with bringing new treatments to patients and underscoring the need for more sophisticated network-based approaches.

An Agentic System for Drug Repurposing

ChatDRex employs a Multi-Agent System (MAS) architecture, differentiating it from monolithic Large Language Models. This design distributes cognitive tasks across multiple specialized agents, each responsible for a specific function such as knowledge retrieval, reasoning, or output generation. Each agent operates autonomously, communicating and collaborating to achieve a common goal – in this case, drug repurposing predictions. The MAS approach allows for modularity and scalability, enabling the integration of diverse tools and algorithms. Furthermore, it facilitates focused reasoning by assigning specialized roles, improving the accuracy and interpretability of the system’s outputs compared to a single, general-purpose model.
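The division of labor described above can be sketched in a few lines. This is a minimal illustration, not ChatDRex's implementation: the agent names, the keyword-based router, and the handler strings are all hypothetical, and a real system would most likely let an LLM choose the agent.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Agent:
    name: str
    keywords: List[str]            # hypothetical routing hints
    handle: Callable[[str], str]   # the agent's specialized behavior


def route(query: str, agents: List[Agent], fallback: Agent) -> Agent:
    """Pick the agent whose keywords match the query most often."""
    text = query.lower()
    best, best_hits = fallback, 0
    for agent in agents:
        hits = sum(1 for kw in agent.keywords if kw in text)
        if hits > best_hits:
            best, best_hits = agent, hits
    return best


# Hypothetical specialists; each handler stands in for a real tool call.
module_agent = Agent("module", ["module", "seed gene"],
                     lambda q: "would run disease module detection")
drug_agent = Agent("drug", ["drug", "repurpos"],
                   lambda q: "would run drug ranking")
literature_agent = Agent("literature", ["paper", "literature"],
                         lambda q: "would search the literature")
chat_agent = Agent("chat", [], lambda q: "would answer conversationally")

specialists = [module_agent, drug_agent, literature_agent]
chosen = route("Which drugs could be repurposed for asthma?", specialists, chat_agent)
reply = chosen.handle("Which drugs could be repurposed for asthma?")
```

Keeping each specialist behind the same small interface is what makes the design modular: adding a new tool means adding one more `Agent`, without touching the others.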

ChatDRex combines Large Language Models (LLMs) with network-based algorithms to enhance drug repurposing capabilities. For disease module identification it uses DIAMOnD, an algorithm that starts from a set of disease-associated seed genes and iteratively adds the genes most significantly connected to them in a protein-protein interaction network. Drug candidates are then prioritized with ranking algorithms including TrustRank, a PageRank variant that propagates ‘trust’ from the seed nodes through the network so that drugs near the seeds score highly, and Closeness Centrality, which ranks drugs by how short their network paths to the seeds are. This combination lets ChatDRex pair the language capabilities of LLMs with structured, interconnected network data to generate and rank drug repurposing predictions.
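To make the closeness-based ranking concrete, here is a small sketch on a toy network. It assumes an unweighted, undirected graph and scores each drug by the number of reachable seed nodes divided by the summed shortest-path distance to them. The node names and edges are invented for illustration, and the exact scoring used by NeDRex may differ.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected adjacency map from an edge list."""
    graph: Graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def bfs_distances(graph: Graph, source: str) -> Dict[str, int]:
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def rank_drugs_by_closeness(graph: Graph, drugs: List[str],
                            seeds: Set[str]) -> List[Tuple[str, float]]:
    """Score = reachable seeds / total distance to them; higher is closer."""
    scores = {}
    for drug in drugs:
        dist = bfs_distances(graph, drug)
        reachable = [dist[s] for s in seeds if s in dist]
        total = sum(reachable)
        scores[drug] = len(reachable) / total if total > 0 else 0.0
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy network: drugA targets both seed genes, drugB sits two or more hops away.
toy = build_graph([("drugA", "G1"), ("drugA", "G2"),
                   ("drugB", "G3"), ("G3", "G1"), ("G1", "G2")])
ranking = rank_drugs_by_closeness(toy, ["drugA", "drugB"], {"G1", "G2"})
```

In this toy case drugA is one hop from each seed and ranks first, while drugB reaches the seeds only through G3 and ranks second.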

Chain-of-Thought (CoT) prompting is implemented within ChatDRex to enhance the system’s reasoning capabilities and improve the interpretability of drug predictions. This technique involves structuring prompts to encourage the Large Language Model to articulate a series of intermediate reasoning steps before arriving at a final prediction. By explicitly detailing the rationale behind each step – such as identifying relevant biological pathways, assessing drug mechanisms of action, and considering potential off-target effects – CoT provides a transparent audit trail of the decision-making process. This approach facilitates verification of the system’s logic, allows for identification of potential biases or errors, and ultimately increases confidence in the generated drug repurposing hypotheses.
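A CoT prompt of this kind might be assembled roughly as follows. The step wording mirrors the rationale listed above, but the template, the function name `build_cot_prompt`, and the placeholder drug, disease, and evidence strings are assumptions made for illustration; the prompts actually used by ChatDRex are not reproduced here.

```python
from typing import List

# Reasoning steps taken from the description above; the wording is illustrative.
REASONING_STEPS = [
    "Identify the biological pathways relevant to the disease.",
    "Assess the drug's mechanism of action against those pathways.",
    "Consider potential off-target effects.",
]


def build_cot_prompt(drug: str, disease: str, evidence: List[str]) -> str:
    """Return a prompt that asks for numbered reasoning before the answer."""
    lines = [f"Question: Could {drug} be repurposed for {disease}?", "", "Evidence:"]
    lines += [f"- {item}" for item in evidence] or ["- (none provided)"]
    lines += ["", "Reason step by step before answering:"]
    lines += [f"Step {i}: {step}" for i, step in enumerate(REASONING_STEPS, start=1)]
    lines += ["", "Final answer (plausible / implausible / insufficient evidence):"]
    return "\n".join(lines)


# Placeholder names, not real entities.
prompt = build_cot_prompt("DrugX", "DiseaseY", ["DrugX targets GeneA"])
```

Because the model is asked to fill in each numbered step, a reviewer can check the intermediate claims one by one rather than judging only the final verdict.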

The Foundation: NeDRex Knowledge and Infrastructure

ChatDRex utilizes the NeDRex platform as its foundational knowledge source for drug repurposing initiatives. This platform is built upon the NeDRex KG (Knowledge Graph), a curated collection of biomedical data designed to represent relationships between drugs, diseases, and related entities. Access to this knowledge is facilitated through the NeDRexAPI, which provides a programmatic interface for querying and retrieving information from the knowledge graph. The NeDRex KG encompasses data from multiple sources, including databases, scientific literature, and clinical trials, and is continuously updated to ensure the information provided to ChatDRex is current and reliable.

The ChatDRex system architecture utilizes LangChain4j, a Java/Kotlin framework for developing applications powered by language models, and Quarkus, a Kubernetes-native Java framework optimized for cloud environments. This technology stack provides a foundation for both rapid development and high performance. Quarkus’s fast startup times and low memory footprint contribute to efficient resource utilization, while LangChain4j facilitates the integration and management of large language models (LLMs) within the system. The combination ensures the platform can scale to accommodate increasing data volumes and user requests, maintaining responsiveness and reliability as computational demands grow.

The ChatDRex conversational interface utilizes Retrieval-Augmented Generation (RAG) techniques to improve response accuracy by grounding Large Language Model (LLM) outputs in a verified knowledge base. Specifically, RAG dynamically retrieves relevant information from the NeDRex Knowledge Graph prior to LLM processing, mitigating the risk of hallucination and enhancing factual correctness. The LLMs themselves are hosted using Ollama, a framework that facilitates efficient deployment and management of LLMs, contributing to a responsive user experience and reducing computational demands. This combination of RAG and local LLM hosting allows ChatDRex to provide more reliable and timely drug repurposing insights.
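The retrieve-then-generate pattern can be sketched without any LLM at all. The example below assumes a tiny in-memory list of facts and a crude token-overlap retriever; the facts are placeholders rather than NeDRex data, and a production system would query the knowledge graph and typically use a stronger retrieval method.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens; deliberately crude."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, facts: List[str], k: int = 2) -> List[str]:
    """Return up to k facts sharing at least one token with the query."""
    q = tokenize(query)
    ranked = sorted(facts, key=lambda f: len(q & tokenize(f)), reverse=True)
    return [f for f in ranked[:k] if q & tokenize(f)]


def build_grounded_prompt(query: str, facts: List[str]) -> str:
    """Prepend retrieved facts so the model answers from them, not from memory."""
    context = retrieve(query, facts)
    lines = ["Answer using only the context below.", "Context:"]
    lines += [f"- {fact}" for fact in context]
    lines += [f"Question: {query}", "Answer:"]
    return "\n".join(lines)


# Placeholder facts, not taken from any real knowledge graph.
FACTS = [
    "DrugX targets GeneA",
    "GeneA is associated with DiseaseY",
    "DrugZ targets GeneB",
]
question = "Which drugs target GeneA in DiseaseY?"
retrieved = retrieve(question, FACTS)
grounded = build_grounded_prompt(question, FACTS)
```

Only the two facts that mention GeneA or DiseaseY reach the prompt; the unrelated DrugZ fact is left out, which is the mechanism by which grounding narrows what the model can assert.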

The ChatDRex system incorporates data from Semantic Scholar to supplement its core knowledge graph and enhance reasoning capabilities. Semantic Scholar is utilized as an external knowledge source, providing access to a comprehensive corpus of scientific literature, including research articles, abstracts, and citations. This integration allows ChatDRex to retrieve and incorporate the latest findings relevant to drug repurposing inquiries, improving the accuracy and comprehensiveness of its responses. The system queries Semantic Scholar based on the user’s input and the context of the conversation, extracting pertinent information to support its reasoning process and provide up-to-date insights.

Refining Reasoning with Advanced Analytical Techniques

ChatDRex employs In-Context Learning (ICL) as a method to enhance the performance of Large Language Models (LLMs) during the inference stage. ICL involves providing the LLM with a limited number of illustrative examples, directly within the input prompt, that demonstrate the desired task or reasoning process. Rather than updating model weights, ICL guides the LLM’s response generation by establishing a contextual framework based on these provided examples. This approach allows ChatDRex to leverage pre-trained LLMs without requiring extensive fine-tuning for specific biomedical reasoning tasks, improving accuracy and relevance of generated outputs by conditioning the model on relevant, task-specific demonstrations.
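In practice, ICL often amounts to placing a handful of worked demonstrations ahead of the new request. The sketch below assumes the task is mapping a user request to a tool call; the demonstrations and tool names (`run_module_detection`, `rank_drugs`) are hypothetical and not taken from ChatDRex.

```python
from typing import List, Tuple

# Hypothetical demonstrations: (user request, desired tool call).
EXAMPLES: List[Tuple[str, str]] = [
    ("Find a disease module for my seed genes.",
     "run_module_detection(seeds=<user genes>)"),
    ("Rank drugs for this module.",
     "rank_drugs(module=<current module>)"),
]


def build_few_shot_prompt(examples: List[Tuple[str, str]], request: str) -> str:
    """Place the demonstrations before the new request; no weights change."""
    parts = ["Map each request to a tool call."]
    for example_request, tool_call in examples:
        parts.append(f"Request: {example_request}\nTool call: {tool_call}")
    parts.append(f"Request: {request}\nTool call:")
    return "\n\n".join(parts)


few_shot = build_few_shot_prompt(EXAMPLES, "Which drugs are closest to my module?")
```

The model sees the pattern twice and is left to complete the third instance, which is how the examples steer output without any fine-tuning.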

The DIGEST algorithm assesses the functional coherence of identified Disease Modules by evaluating the interconnectedness of genes within a module based on shared biological functions and pathways. This evaluation utilizes Gene Ontology (GO) term enrichment analysis and pathway database searches to determine if the genes within a module exhibit statistically significant overlap in their associated biological processes, molecular functions, and cellular components. A high degree of functional coherence, as determined by DIGEST, indicates that the identified Disease Module is likely biologically relevant and supports the validity of predictions derived from it. The algorithm quantifies coherence using a scoring system based on p-values and enrichment factors, providing a metric for assessing the reliability of the module’s association with a specific disease.
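As a rough illustration of what a p-value and an enrichment factor mean for a single annotation term, the sketch below uses the standard hypergeometric over-representation test. This is a generic stand-in, not DIGEST's own procedure, and the gene counts are invented.

```python
from math import comb


def hypergeom_pvalue(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k): chance of seeing at least k annotated genes in a module
    of size n, drawn from N background genes of which K carry the term."""
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


def enrichment_factor(k: int, n: int, K: int, N: int) -> float:
    """Observed annotated fraction in the module over the background fraction."""
    return (k / n) / (K / N)


# Invented numbers: 100 background genes, 10 carry the term,
# and a 5-gene module contains 3 of them.
p_value = hypergeom_pvalue(k=3, n=5, K=10, N=100)
factor = enrichment_factor(k=3, n=5, K=10, N=100)
```

Here the term is about six times more frequent in the module than in the background, and the p-value is below 0.01. A real analysis would repeat this over many terms and correct for multiple testing.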

Reported performance is stronger for tool use than for final answers. The system achieves an average Tool Accuracy of 0.86, representing the correctness of external tool utilization. Call Accuracy, measuring the precision of function calls made by the system, is reported at 0.852. Answer Accuracy, which assesses the overall correctness of the final responses generated, is noticeably lower at 0.61. Taken together, these figures suggest that selecting and calling the right tool is the more reliable part of the pipeline, while turning tool output into a correct final answer leaves more room for improvement.

The NeDRex Knowledge Graph (KG) underwent performance evaluation utilizing the F₁-score, a metric representing the harmonic mean of precision and recall. This evaluation assessed the KG’s effectiveness in accurately retrieving relevant biological information. The F₁-score considers both the proportion of retrieved instances that are relevant and the proportion of relevant instances that are successfully retrieved, providing a balanced measure of knowledge retrieval performance. Higher F₁-scores indicate a more effective KG in identifying and delivering pertinent data for downstream analysis and reasoning tasks.
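The F₁-score is straightforward to compute once the retrieved and relevant sets are known. The sketch below uses made-up entity identifiers to show the arithmetic; it does not reproduce the evaluation data from the paper.

```python
from typing import Set, Tuple


def precision_recall_f1(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for one retrieval result."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up identifiers: 4 items retrieved, 3 truly relevant, 2 in common.
precision, recall, f1 = precision_recall_f1({"A", "B", "C", "D"}, {"B", "C", "E"})
```

With 2 of 4 retrieved items relevant (precision 0.5) and 2 of 3 relevant items found (recall about 0.67), the harmonic mean works out to 4/7, roughly 0.57, which sits closer to the lower of the two values than an arithmetic mean would.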

Towards a Future of Personalized Therapeutic Interventions

The next phase of research centers on tailoring drug repurposing predictions to the individual patient. Recognizing that disease manifestation and drug response vary significantly between individuals, future models will incorporate patient-specific data – encompassing genomic profiles, medical history, lifestyle factors, and real-time monitoring data – to refine predictions. This personalized approach moves beyond population-level efficacy, aiming to identify drugs most likely to be effective for a particular patient’s unique disease profile. By leveraging the power of machine learning to analyze these complex datasets, researchers anticipate a substantial increase in the success rate of repurposed therapies and a reduction in adverse drug reactions, ultimately paving the way for precision medicine in drug development.

The predictive power of drug repurposing systems is fundamentally limited by the comprehensiveness of the underlying biological knowledge. Current approaches often rely on relatively sparse data regarding gene expression, protein interactions, and known drug mechanisms. Integrating multi-omics data – encompassing genomics, transcriptomics, proteomics, and metabolomics – promises to create a far more nuanced and accurate representation of disease pathways and drug targets. This expanded knowledge graph allows the system to identify previously unseen connections between drugs and diseases, improving prediction accuracy and robustness. By considering the complex interplay of molecular components, the system can move beyond simple target-based predictions and account for off-target effects, compensatory mechanisms, and individual patient variability, ultimately increasing the likelihood of successful repurposing candidates.

A critical next step involves translating this complex computational work into an accessible tool for the broader scientific and medical communities. Development of a user-friendly interface will remove significant barriers to entry, allowing researchers and clinicians – even those without extensive bioinformatics expertise – to readily explore potential drug repurposing candidates. This democratization of access fosters innovation by enabling wider experimentation and validation of predictions, ultimately accelerating the translation of in silico discoveries into tangible benefits for patients. By simplifying the process of querying the knowledge graph and interpreting results, this interface will empower a more collaborative and efficient approach to identifying new therapeutic uses for existing drugs, moving beyond specialist labs and into everyday practice.

The architecture of ChatDRex embodies a philosophy where systemic understanding dictates successful outcomes. The system’s multi-agent approach, facilitating conversational access to complex bioinformatics tools, reflects an appreciation for interconnectedness. As a line widely attributed to Alan Kay puts it, “The best way to predict the future is to invent it.” ChatDRex doesn’t merely analyze existing data; it actively constructs possibilities by enabling researchers to explore disease modules and drug repurposing predictions through natural language. This proactive invention, built upon a robust knowledge graph and modular design, suggests that if the system survives on duct tape, it’s probably overengineered – ChatDRex aims for elegant, sustainable solutions, not temporary fixes.

What’s Next?

The elegance of ChatDRex lies in its attempt to abstract away complexity. If the system looks clever, it’s probably fragile. The immediate challenge isn’t simply scaling the knowledge graph – though that remains a considerable undertaking – but assessing the provenance and reliability of the information it contains. A network is only as strong as its weakest link, and a conversational interface offers little protection against confidently delivered nonsense. The true test will be predictive accuracy, not ease of use.

Furthermore, the multi-agent architecture, while promising, raises the question of agency itself. These agents aren’t reasoning; they’re pattern-matching at scale. The illusion of conversation shouldn’t be mistaken for genuine understanding. Future iterations must grapple with the distinction between statistical correlation and biological causality. The system currently identifies potential candidates; discerning true efficacy remains the domain of rigorous experimentation.

Architecture, after all, is the art of choosing what to sacrifice. ChatDRex sacrifices computational depth for accessibility. This is a reasonable trade, but a temporary one. The field will inevitably move toward systems that integrate both conversational interfaces and sophisticated modeling – systems that don’t merely suggest drugs, but explain why they might work. That will require a level of transparency currently absent, and a willingness to confront the inherent limitations of even the most elegant design.


Original article: https://arxiv.org/pdf/2511.21438.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/

2025-11-28 15:01