Author: Denis Avetisyan
Researchers now have a powerful new tool to navigate the ever-growing body of scientific literature, powered by artificial intelligence.

This paper introduces ORKG ASK, an open-source system combining knowledge graphs, vector search, and large language models for transparent and reproducible scholarly literature exploration.
The exponential growth of scholarly literature presents a significant challenge to efficient knowledge discovery. Addressing this, we introduce ORKG ASK, an AI-driven system detailed in ‘Introducing ORKG ASK: an AI-driven Scholarly Literature Search and Exploration System Taking a Neuro-Symbolic Approach’, which combines vector search, Large Language Models, and knowledge graphs for enhanced literature exploration. This neuro-symbolic approach enables ASK not only to retrieve relevant articles but also to synthesize answers to complex research questions using Retrieval-Augmented Generation. Will this transparent and reproducible system redefine how researchers navigate and leverage the ever-expanding landscape of scientific knowledge?
Beyond Keyword Search: Navigating the Complexity of Scholarly Discovery
Scholarly search has historically prioritized identifying documents that contain specific keywords, a method that often overlooks the intricate connections and contextual subtleties within research. This reliance on lexical matching frequently yields a fragmented understanding, as nuanced arguments, implicit assumptions, and relationships between concepts are lost along the way. Analyses of citation networks, for instance, suggest that highly influential papers are not necessarily those with the most keyword occurrences, but those that synthesize existing knowledge in novel ways, a capability traditional search struggles to recognize. Consequently, researchers may miss crucial insights hidden within the broader scholarly landscape, hindering their ability to build on prior work and accelerate discovery; the limitations of keyword-based systems call for more sophisticated approaches capable of capturing the semantic richness of scientific literature.
The sheer volume of published scientific research is escalating at an unprecedented rate, presenting a significant challenge to researchers attempting to stay current in their fields. Estimates suggest that millions of new papers are added to the scientific record annually, far exceeding any individual’s capacity for comprehensive review. This exponential growth isn’t simply a matter of increased data; it necessitates a shift from traditional information retrieval to more intelligent knowledge discovery systems. These systems must move beyond simply matching keywords to understanding the complex relationships between concepts, identifying emerging trends, and synthesizing information across disparate sources. Consequently, researchers are increasingly reliant on tools capable of curating, filtering, and interpreting the vast landscape of scientific literature, effectively transforming raw data into actionable insights and preventing critical knowledge from being obscured by sheer volume.
Current scholarly search systems frequently falter when confronted with inquiries demanding the integration of knowledge dispersed across numerous publications. These systems excel at identifying documents containing specific terms, but struggle to synthesize information – to connect disparate findings, identify underlying trends, or resolve conflicting data. This limitation isn’t merely an inconvenience; it actively impedes the advancement of research. Researchers are often forced to manually sift through countless articles, a process that is both time-consuming and prone to overlooking crucial connections. Consequently, the ability to formulate genuinely novel insights – to move beyond simply locating information and towards understanding complex relationships – remains a significant challenge, despite the ever-increasing volume of available research.
ASK: A Neuro-Symbolic Framework for Knowledge Exploration
ASK utilizes a Retrieval-Augmented Generation (RAG) framework, which functions by first retrieving relevant documents from a knowledge source using information retrieval techniques. These retrieved documents are then provided as context to a large language model (LLM), allowing the LLM to generate responses grounded in factual information rather than solely relying on its pre-trained parameters. This process mitigates the LLM’s tendency to hallucinate or produce inaccurate information and enhances its ability to address complex queries by accessing and synthesizing external knowledge. The RAG architecture, therefore, combines the LLM’s generative capabilities with the reliability of information retrieval systems.
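In outline, the loop looks like the sketch below: retrieve evidence, ground a prompt in it, then generate. This is a minimal illustration assuming a generic vector index and LLM interface; the function names, the `index.search` and `llm.generate` calls, and the prompt wording are placeholders, not ORKG ASK's actual API.

```python
# Minimal retrieve-then-generate (RAG) loop -- illustrative sketch only.

def retrieve(query: str, index, k: int = 5) -> list[str]:
    """Return the k most similar passages from a vector index (hypothetical API)."""
    return index.search(query, top_k=k)

def build_prompt(query: str, passages: list[str]) -> str:
    """Ground the LLM by placing retrieved evidence before the question."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using ONLY the sources below, citing them by number.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

def rag_answer(query: str, index, llm) -> str:
    passages = retrieve(query, index)
    return llm.generate(build_prompt(query, passages))  # hypothetical LLM call
```

Because the answer is constrained to the retrieved sources, the generation step inherits the retriever's factual grounding rather than relying on the model's parameters alone.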
ASK’s core architecture utilizes a neuro-symbolic approach by integrating neural networks with symbolic knowledge representations, specifically Knowledge Graphs. This integration allows the system to leverage the pattern recognition capabilities of neural networks with the structured, logical reasoning enabled by Knowledge Graphs. Knowledge Graphs provide a formalized representation of entities and their relationships, while neural networks process and understand unstructured data. By combining these approaches, ASK can move beyond simple keyword matching and perform more complex inferences based on both semantic similarity and explicit knowledge, enabling a more robust and explainable knowledge exploration process.
The ASK architecture integrates semantic search and reasoning capabilities by combining neural network processing with symbolic knowledge representations. Semantic search, facilitated by the neural network component, identifies relevant information based on the meaning of queries rather than keyword matching. Simultaneously, the system leverages symbolic reasoning, utilizing Knowledge Graphs to infer new knowledge and relationships from existing data. This allows ASK to not only retrieve relevant research but also to synthesize information, identify connections between concepts, and provide more nuanced and comprehensive answers to complex research questions, exceeding the capabilities of systems relying solely on statistical language modeling or keyword-based retrieval.
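To make the combination concrete, the toy sketch below ranks documents by embedding similarity (the neural step) and then boosts documents whose entities share explicit triples with the query's entities (the symbolic step). The graph contents and the boost heuristic are illustrative assumptions, not ASK's implementation or the ORKG schema.

```python
import numpy as np

# Toy knowledge graph as (subject, predicate, object) triples -- illustrative data.
KG = {
    ("BERT", "is_a", "language_model"),
    ("BERT", "evaluated_on", "GLUE"),
    ("GLUE", "is_a", "benchmark"),
}

def kg_facts(entity: str) -> set:
    """Symbolic step: pull explicit triples mentioning an entity."""
    return {t for t in KG if entity in (t[0], t[2])}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def hybrid_rank(query_vec, query_entities, docs):
    """Neural step ranks by embedding similarity; symbolic step boosts
    documents whose entities share explicit KG facts with the query."""
    query_facts = set().union(*(kg_facts(e) for e in query_entities))
    ranked = []
    for doc in docs:  # each doc: {"text": str, "vec": ndarray, "entities": [str]}
        score = cosine(query_vec, doc["vec"])
        doc_facts = set().union(*(kg_facts(e) for e in doc["entities"]))
        shared = doc_facts & query_facts
        score += 0.1 * len(shared)  # assumed heuristic, not ASK's actual weighting
        ranked.append((score, doc["text"], shared))
    return sorted(ranked, key=lambda r: r[0], reverse=True)
```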

Engineering Semantic Precision: Vector Search and Large Language Models
ASK’s search functionality relies on vector search, a method of information retrieval that represents articles and queries as high-dimensional vectors. These vectors are generated using the Nomic Embedding Model, which translates text into numerical representations capturing semantic meaning. Similarity between the query vector and article vectors is then calculated using cosine similarity or other distance metrics. Articles with the highest similarity scores are identified as being semantically relevant to the query, even if they lack explicit keyword matches. This process allows ASK to retrieve information based on conceptual relevance rather than solely on lexical terms, improving search recall and surfacing relevant research that traditional keyword-based methods might miss.
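Stripped to its core, the scoring step is cosine similarity between embedding vectors. In the sketch below, `embed()` is a dummy stand-in for the Nomic Embedding Model; any text-to-vector encoder fits the same interface.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder for the Nomic Embedding Model: returns dummy vectors here;
    a real encoder maps text to semantically meaningful coordinates."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(768)  # assumed 768-dim embedding space

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = "catalysts for low-temperature CO2 reduction"
articles = [
    "Electrocatalytic conversion of carbon dioxide at ambient conditions",
    "A survey of transformer architectures for NLP",
]
q = embed(query)
# With a real semantic encoder, the CO2 article would rank first even though
# it shares almost no keywords with the query.
ranked = sorted(articles, key=lambda a: cosine_similarity(q, embed(a)), reverse=True)
```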
Traditional information retrieval systems rely heavily on keyword matching, which limits results to documents containing the exact search terms. ASK, however, utilizes semantic search, evaluating the meaning of the query and comparing it to the semantic meaning of articles within its database. This is achieved through vector embeddings, numerical representations of text that capture contextual relationships between words and concepts. By comparing the vector representation of the query to those of the articles, ASK can identify documents relevant to the query’s intent, even if those documents use different terminology or phrasing. This capability significantly expands the scope of relevant results beyond what keyword-based systems can achieve, improving recall and delivering more comprehensive answers.
Following the vector search process, the Mistral Large Language Model (LLM) is utilized to formulate responses. This LLM doesn’t simply reiterate retrieved text; it synthesizes information from multiple relevant articles identified by the vector search, creating a coherent and informative answer to the user’s query. The Mistral LLM’s capabilities include natural language generation, allowing it to present the synthesized information in a readable and contextually appropriate format, effectively serving as a reasoning engine on top of the retrieved knowledge.
ASK incorporates a Non-Parametric Memory system to augment the knowledge available to the underlying Large Language Model (LLM). This system functions by storing embeddings of external documents – in this case, research articles – in a vector database. During query processing, relevant documents are retrieved from this database via semantic similarity search and provided as context to the LLM. This effectively extends the LLM’s knowledge base beyond the data it was originally trained on, allowing ASK to answer questions and generate responses based on a dynamically updated and expanded corpus of information without requiring model retraining or fine-tuning.
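The principle behind non-parametric memory is that knowledge lives in an index that can grow without touching model weights. A toy in-memory version is sketched below; a production system would use a persistent vector database, but the interface is the same: documents go in at any time, and retrieval supplies context at query time.

```python
import numpy as np

class NonParametricMemory:
    """Toy external memory: add documents at any time, retrieve by similarity.
    The LLM's weights never change; only this index grows."""

    def __init__(self, embed_fn):
        self.embed = embed_fn  # any text-to-vector encoder
        self.texts: list[str] = []
        self.vectors: list[np.ndarray] = []

    def add(self, text: str) -> None:
        self.texts.append(text)
        self.vectors.append(self.embed(text))

    def query(self, question: str, k: int = 3) -> list[str]:
        q = self.embed(question)
        sims = [float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
                for v in self.vectors]
        top = np.argsort(sims)[::-1][:k]
        return [self.texts[i] for i in top]

# New papers can be ingested after deployment -- no retraining required:
# memory = NonParametricMemory(embed)
# memory.add("Abstract of a newly published article ...")
# context = memory.query("What methods exist for X?")
```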

Validating the Impact: User Experience and Efficiency Gains
Evaluations using the NASA Task Load Index (TLX), a widely used instrument that rates perceived mental demand, physical demand, temporal demand, performance, effort, and frustration, consistently showed that ASK markedly reduces the cognitive burden of literature search compared with conventional tools such as Google Scholar. Participants reported a significantly lower overall task load when completing research tasks with ASK, suggesting that the system streamlines information retrieval and lets users focus on analysis and synthesis rather than on formulating complex queries and sifting through irrelevant results. Beyond improving the user experience, this reduction in perceived workload may also curb errors and support more thorough investigation of complex topics.
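For reference, the unweighted ("raw") TLX variant is simply the mean of the six subscale ratings on a 0-100 scale; the ratings in the snippet below are illustrative, not values reported for ASK.

```python
# Raw (unweighted) NASA-TLX: mean of six subscale ratings on a 0-100 scale.
# The ratings below are illustrative, not data from the ASK evaluation.
ratings = {
    "mental_demand": 35, "physical_demand": 5, "temporal_demand": 30,
    "performance": 20, "effort": 40, "frustration": 15,
}
raw_tlx = sum(ratings.values()) / len(ratings)
print(f"Raw TLX workload: {raw_tlx:.1f}")  # lower means less perceived workload
```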
A substantial dataset was compiled through operational feedback gathered directly from the integrated ASK interface, encompassing the experiences of 1212 users. This wealth of real-world interaction data provided invaluable insights into system performance and user behavior beyond controlled laboratory settings. The feedback mechanism, seamlessly incorporated into the ASK experience, allowed for continuous monitoring of user satisfaction and identification of areas for improvement. Analysis of this large-scale user input revealed patterns in query formulation, feature utilization, and perceived system effectiveness, ultimately informing iterative design enhancements and ensuring ASK remains aligned with user needs and expectations.
Data captured through Matomo web analytics indicates a remarkably low bounce rate of 3% for the ASK system, a metric suggesting substantial user engagement. This figure signifies that only 3 out of every 100 users navigate away from the system immediately after viewing a single page, contrasting sharply with typical bounce rates observed on many web platforms. Such consistent interaction implies users find the information presented by ASK relevant and the interface conducive to continued exploration, bolstering the claim that the system effectively supports knowledge discovery and reduces the need for users to initiate new search queries elsewhere. The low bounce rate, therefore, provides compelling evidence for ASK’s usability and its capacity to maintain user attention throughout the research process.
Evaluations of the ASK system yielded an average Usability Metric for User Experience (UMUX) score of 65.7, which denotes a moderate level of usability. This score, derived from standardized questionnaires, suggests that while users generally find the system functional and acceptable, there remains room for refinement in areas such as ease of learning or aesthetic appeal. A moderate UMUX score is a solid foundation: the system meets core user needs without necessarily exceeding expectations, and further iterative design and user testing could raise satisfaction and efficiency. The metric also provides a valuable benchmark for tracking improvements as the system evolves in response to user feedback.
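For context, a UMUX score is computed from four 7-point Likert items: positively worded items contribute (score − 1), negatively worded items contribute (7 − score), and the sum is rescaled to 0-100. The responses below are illustrative, not data from the ASK evaluation.

```python
# UMUX (Usability Metric for User Experience): four 7-point Likert items,
# alternating positively and negatively worded. Illustrative responses only.
responses = [6, 3, 5, 2]  # items 1-4, each rated 1..7

contributions = [
    (r - 1) if i % 2 == 0 else (7 - r)  # even index -> positively worded item
    for i, r in enumerate(responses)
]
umux = sum(contributions) * 100 / 24  # max contribution per item is 6, so max sum is 24
print(f"UMUX score: {umux:.1f}")  # 0-100; ASK's reported average was 65.7
```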
Evaluations reveal that ASK actively diminishes the potential for Large Language Model (LLM)-induced hallucinations – instances where the model generates factually incorrect or nonsensical information. This mitigation isn’t achieved through algorithmic adjustments to the LLM itself, but rather through a robust grounding mechanism. ASK prioritizes basing its responses on directly retrieved knowledge from a curated data source, effectively anchoring the LLM’s output in verifiable facts. By consistently referencing this retrieved information, the system minimizes reliance on the LLM’s potentially flawed internal knowledge, leading to more reliable and trustworthy responses. This approach ensures that even when faced with ambiguous or complex queries, ASK delivers answers firmly rooted in established data, rather than speculative or fabricated content.

Towards the Future of Scholarly Exploration: Expanding Horizons
Ongoing research prioritizes a tighter coupling between Knowledge Graphs and Large Language Models (LLMs) to significantly improve analytical reasoning. Currently, LLMs excel at pattern recognition and textual generation, but often struggle with complex inferential tasks requiring structured knowledge. By integrating LLMs with Knowledge Graphs – which represent information as interconnected entities and relationships – researchers aim to provide a robust framework for logical deduction and nuanced understanding. This synergistic approach allows LLMs to not only process information but also to reason about it, verifying responses against established knowledge and identifying potential inconsistencies. The anticipated outcome is an artificial intelligence capable of moving beyond simple information retrieval towards genuine scholarly insight, enabling more accurate hypothesis generation and accelerating the pace of scientific discovery through enhanced computational reasoning.
The continued evolution of ASK relies heavily on a dramatically expanded knowledge base, achieved through the growth of the CORE dataset and the integration of increasingly diverse data sources. Currently, ASK draws from a substantial collection of open-access research, but its potential is limited by gaps in coverage and representation. Future development prioritizes the inclusion of datasets beyond traditional academic publications – encompassing preprints, clinical trial records, patents, and even datasets derived from grey literature. This broadened scope isn’t simply about quantity; it’s about providing a more holistic and nuanced understanding of scientific inquiry, allowing ASK to connect disparate pieces of information and identify emerging trends often missed by conventional search methods. Such a comprehensive approach promises to significantly enhance ASK’s ability to provide insightful and relevant responses to complex scholarly questions.
Current research prioritizes the refinement of prompt engineering to unlock the full potential of large language models in scholarly contexts. This involves moving beyond simple question-and-answer formats to explore techniques like chain-of-thought prompting, where the model is encouraged to articulate its reasoning steps, and retrieval-augmented generation, which integrates external knowledge sources directly into the response generation process. Furthermore, investigations are underway to develop automated prompt optimization strategies, leveraging machine learning to identify prompt structures that consistently yield more accurate, relevant, and nuanced responses. The ultimate goal is to create a system where complex scholarly inquiries are met with not just information, but with well-reasoned, contextually aware insights, effectively transforming language models from information retrieval tools into collaborative research assistants.
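As a concrete illustration, a chain-of-thought prompt might look like the generic template below; the wording is an assumption for illustration, not a prompt taken from the paper.

```python
# A generic chain-of-thought prompt template -- illustrative only.
cot_prompt = (
    "Question: Which of the retrieved studies supports hypothesis H, and why?\n"
    "Think step by step: first summarize each study's finding, then state\n"
    "whether it supports, contradicts, or is irrelevant to H, then conclude.\n"
    "Reasoning:"
)
# answer = llm.generate(cot_prompt)  # llm: any text-generation interface
```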
The architecture of ASK is designed to coalesce into a pivotal resource for scholarly investigation, addressing the escalating challenge of information overload within research fields. By synthesizing data from diverse sources and employing advanced knowledge representation, ASK aims to move beyond simple information retrieval, instead offering researchers a dynamic platform for knowledge discovery. This system isn’t merely intended to locate relevant papers, but to actively connect concepts, identify emerging trends, and facilitate interdisciplinary exploration. Ultimately, ASK seeks to empower researchers by dramatically reducing the time spent navigating the expanding landscape of scientific knowledge, and instead allowing them to focus on innovation and critical analysis – effectively serving as a cognitive partner in the pursuit of new understanding.
ORKG ASK demonstrates a commitment to building systems where transparency and reproducibility are not afterthoughts, but foundational principles. The integration of knowledge graphs, vector search, and large language models, while complex, aims to illuminate the reasoning behind search results – a critical step towards fostering trust in AI-driven scholarly exploration. As Brian Kernighan aptly stated, “Complexity adds maintenance cost.” ORKG ASK seeks to manage this complexity by offering a modular and open-source design, allowing researchers to understand and verify each component. This mindful approach acknowledges that good architecture is invisible until it breaks, and only then is the true cost of decisions visible.
What’s Next?
The pursuit of intelligent scholarly search invariably reveals a fundamental tension: optimization in one area predictably introduces fragility elsewhere. ORKG ASK, by integrating vector search, large language models, and knowledge graphs, offers a compelling step towards transparent retrieval-augmented generation. However, architecture is the system's behavior over time, not a diagram on paper. The immediate challenge lies not simply in scaling this approach, though that remains significant, but in characterizing the emergent properties of such a hybrid system. What biases are subtly amplified by the interplay between semantic and statistical retrieval? How does the system's 'understanding' of a query evolve with the knowledge graph, and at what point does that evolution become opaque even to its creators?
Reproducibility, rightly emphasized in this work, is less a destination and more a continuous calibration. Open source is a necessary, but insufficient, condition. True reproducibility demands detailed provenance tracking not only of the code and data, but also of the reasoning process itself. The current focus on LLM prompting and knowledge graph construction obscures a deeper issue: the very definition of ‘relevance’ is fluid and context-dependent. A system that merely finds papers is less valuable than one that helps a researcher navigate the landscape of uncertainty inherent in any complex field.
Future work should therefore prioritize not just performance metrics, but also the development of tools for introspective analysis. The goal is not to build a 'perfect' search engine but a characterizable one: a system whose limitations are as readily apparent as its strengths. Only then can researchers truly leverage these tools as partners in discovery, rather than black boxes dispensing information.
Original article: https://arxiv.org/pdf/2512.16425.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/