Thinking Recommenders: Bridging AI and Cognition for Explainable Choices

Author: Denis Avetisyan


A new approach combines the power of large language models with a cognitive architecture to create recommendation systems that not only suggest items but also explain why.

CogRec facilitates instance recommendation through a defined process, enabling the selection of relevant items based on underlying system dynamics.

CogRec fuses large language models and the Soar architecture to achieve explainable, adaptable, and robust recommendations through neuro-symbolic integration and online learning.

Despite the promise of large language models (LLMs) in understanding user preferences, their “black-box” nature and limited adaptability hinder trustworthy recommendations. This paper introduces CogRec: A Cognitive Recommender Agent Fusing Large Language Models and Soar for Explainable Recommendation, a novel neuro-symbolic approach that synergistically combines LLMs with the structured reasoning of the Soar cognitive architecture. By leveraging Soar’s symbolic processing and online learning via chunking, CogRec not only enhances recommendation accuracy but also provides interpretable rationales for its choices. Could this integration unlock a new generation of adaptable and transparent recommender systems capable of addressing complex, real-world challenges?


The Limitations of Conventional Recommendation Systems

Conventional recommendation systems, such as those employing matrix factorization techniques like Bayesian Personalized Ranking Matrix Factorization (BPR-MF) or sequential models like Self-Attentive Sequential Recommendation (SASRec), frequently encounter limitations when addressing the nuances of user behavior. These methods often assume relatively stable preferences, struggling to capture the dynamic and multifaceted nature of individual tastes. Furthermore, a significant challenge arises with the ‘cold-start’ problem – accurately predicting preferences for new users or items with limited interaction data. Because these algorithms rely heavily on historical data, they exhibit diminished performance when faced with sparse datasets, leading to less relevant and personalized recommendations. This inability to effectively model complex preferences and handle new entities restricts their adaptability and overall effectiveness in real-world applications.

Many conventional recommendation systems operate as “black boxes,” delivering suggestions without articulating the reasoning behind them. This lack of transparency significantly hinders user trust, as individuals are less likely to adopt recommendations they don’t understand. Furthermore, the inability to explain why an item is suggested limits the system’s adaptability to evolving user contexts; a recommendation that seemed relevant yesterday might not be appropriate today, and without understanding the underlying logic, the system cannot readily adjust. The absence of explainability also prevents users from refining the system’s understanding of their preferences, creating a feedback loop that diminishes long-term accuracy and relevance. Consequently, a system’s utility is not merely about predicting what a user might like, but also about providing insights that empower them to explore and discover new interests with confidence.

A fundamental limitation of many recommendation systems lies in the pervasive issue of data sparsity. These algorithms rely on historical interactions – purchases, ratings, clicks – to predict future preferences, but a vast majority of potential item-user pairings lack any recorded interaction. This scarcity of data disproportionately affects predictions for new items, or those infrequently engaged with, as well as for users with limited historical data – the so-called “cold-start” problem. Consequently, the system struggles to accurately identify relevant items, often resorting to recommending popular items regardless of individual preference, or failing to provide personalized suggestions altogether. Addressing data sparsity is thus crucial for enhancing the accuracy and personalization capabilities of recommendation systems, and often necessitates techniques like collaborative filtering enhancements, content-based filtering, or the incorporation of auxiliary information to infer preferences from limited data.

The CogRec framework integrates a large language model as an external knowledge source within a Soar-centric cognitive cycle, enabling neuro-symbolic interaction and learning to translate user input into reasoned recommendations and explanations.

Introducing CogRec: A Synergistic Neuro-Symbolic Approach

CogRec utilizes a neuro-symbolic architecture by integrating Large Language Models (LLMs) with the Soar cognitive architecture. This integration addresses limitations inherent in both statistical and symbolic AI approaches. LLMs, while proficient at pattern recognition and prediction through statistical learning on extensive datasets, lack explicit reasoning capabilities and transparency. Conversely, symbolic systems like Soar excel at logical deduction and problem-solving but require extensive manual knowledge engineering. CogRec combines these strengths: the LLM provides data-driven insights and predictive power, while Soar furnishes a framework for structured reasoning, knowledge representation, and decision-making, effectively bridging the gap between data-driven inference and explicit symbolic manipulation.

CogRec’s neuro-symbolic integration enables both predictive and explanatory capabilities in recommendation systems. By combining the pattern recognition strengths of Large Language Models with the rule-based reasoning of the Soar architecture, the system doesn’t simply output a recommendation; it generates a trace of its decision-making process. This trace, composed of symbolic rules and justifications derived from both the LLM’s knowledge and Soar’s inference engine, provides a transparent account of why a specific recommendation was made. The resulting explanation details the chain of reasoning, linking user preferences, item attributes, and applied rules, offering a level of interpretability absent in traditional LLM-based recommendation systems.

The Soar cognitive architecture implements dynamic impasse resolution to address limitations in its knowledge base. When Soar encounters an inability to proceed with a task due to insufficient information – an impasse – it activates a mechanism to acquire the necessary knowledge. In CogRec, this involves formulating a query and directing it to the integrated Large Language Model (LLM). The LLM then generates relevant information, which Soar receives and incorporates into its working memory. This allows Soar to break down complex problems into smaller, resolvable steps and leverage the LLM’s statistical knowledge to overcome gaps in its symbolic reasoning capabilities, effectively continuing task execution.
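The impasse-driven loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `query_llm`, the working-memory layout, and the canned response are all assumptions made for the example.

```python
# Sketch of Soar-style impasse resolution backed by an external LLM.
# `query_llm` is a hypothetical stub standing in for a real LLM API call.

def query_llm(prompt: str) -> dict:
    """Stand-in for an LLM call; returns facts to add to working memory."""
    if "genre of 'Alien'" in prompt:
        return {("Alien", "genre"): "sci-fi horror"}
    return {}

def detect_impasse(working_memory: dict, needed: list):
    """Return the missing (item, attribute) keys, or None if we can proceed."""
    missing = [k for k in needed if k not in working_memory]
    return missing or None

def run_cycle(working_memory: dict, needed: list) -> dict:
    impasse = detect_impasse(working_memory, needed)
    if impasse:
        # Formulate a query for each missing fact and fold the answer
        # back into working memory so reasoning can continue.
        for item, attr in impasse:
            prompt = f"What is the {attr} of '{item}'?"
            working_memory.update(query_llm(prompt))
    return working_memory

wm = {("Alien", "year"): 1979}
wm = run_cycle(wm, needed=[("Alien", "year"), ("Alien", "genre")])
```

The key design point is that the LLM is consulted only when symbolic reasoning stalls, keeping the main decision loop deterministic and inspectable.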

CogRec’s Symbol-to-Text Converter automatically generates structured queries when encountering decision impasses, enabling more complex reasoning.
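The symbol-to-text step can be illustrated with a toy converter that flattens working-memory attribute/value pairs into a natural-language query. The function name, state labels, and phrasing are invented for this sketch.

```python
# Hypothetical symbol-to-text conversion: turn symbolic working-memory
# elements into a natural-language query suitable for an LLM.

def symbols_to_query(state: str, attributes: dict) -> str:
    # Sort for deterministic output; join attribute/value pairs into prose.
    parts = [f"{attr} is {val}" for attr, val in sorted(attributes.items())]
    return (f"Given a recommendation state '{state}' where "
            + ", ".join(parts)
            + ", which candidate item best fits the user?")

q = symbols_to_query("rank-candidates",
                     {"user-likes": "space operas", "budget": "low"})
```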

Dissecting CogRec: The PCA Cycle in Detail

The core of CogRec’s operation is the PCA Cycle, which begins with the Perception phase collecting data regarding both the user and the items being considered. This data includes explicit preferences, historical interactions, and item attributes. Subsequently, Cognition utilizes Soar’s Production Rule-Based Reasoning system to analyze this information. Soar operates by matching current situations in working memory to existing production rules, triggering actions and enabling inferences about user needs and item relevance. This rule-based approach allows CogRec to move beyond simple data retrieval and engage in a form of symbolic reasoning to determine the most appropriate recommendations or actions.
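The Perception and Cognition phases can be sketched as a tiny production system. The rule format, fact keys, and thresholds below are assumptions made for illustration, not CogRec's actual rules.

```python
# Illustrative Perception -> Cognition loop over a rule-based system.

def perceive(raw_event: dict) -> dict:
    """Perception: normalize user/item signals into working-memory facts."""
    return {("user", "last_genre"): raw_event["genre"],
            ("user", "rating"): raw_event["rating"]}

# Each production rule pairs a condition on working memory with an action.
RULES = [
    (lambda wm: wm.get(("user", "rating"), 0) >= 4,
     lambda wm: ("recommend-more", wm[("user", "last_genre")])),
    (lambda wm: wm.get(("user", "rating"), 0) < 4,
     lambda wm: ("diversify", None)),
]

def cognize(wm: dict):
    """Cognition: fire the first production rule whose condition matches."""
    for condition, action in RULES:
        if condition(wm):
            return action(wm)
    return None

wm = perceive({"genre": "thriller", "rating": 5})
proposal = cognize(wm)  # ("recommend-more", "thriller")
```

A real Soar agent matches many rules in parallel and resolves conflicts through preferences; the sequential scan here is a simplification.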

When the Soar cognitive architecture encounters an impasse – a situation where existing production rules cannot resolve a problem – an external Large Language Model (LLM) is invoked. The LLM provides relevant knowledge to break the impasse, and this information is not simply passed to Soar; instead, Soar’s chunking mechanism actively integrates the LLM’s output. Chunking creates new, generalized production rules from the specific experience of resolving the impasse with LLM assistance. These new rules are then stored in Soar’s long-term memory, effectively expanding the system’s knowledge base and enabling it to address similar situations autonomously in the future, reducing reliance on the LLM for subsequent encounters with comparable problems.
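The effect of chunking can be demonstrated with a minimal cache-like sketch: once an impasse is resolved with LLM help, the result is compiled into long-term memory so that comparable situations never reach the LLM again. The rule store and the canned answer are illustrative assumptions.

```python
# Sketch of chunking: compile a resolved impasse into a stored rule so
# subsequent encounters with the same situation skip the LLM entirely.

long_term_memory = {}  # situation -> cached result (the "chunked" rule)
llm_calls = 0

def llm_answer(question: str) -> str:
    """Stand-in for a real LLM; counts invocations for illustration."""
    global llm_calls
    llm_calls += 1
    return "sci-fi"

def resolve(situation: str) -> str:
    if situation in long_term_memory:   # chunk fires: no impasse occurs
        return long_term_memory[situation]
    answer = llm_answer(f"Classify: {situation}")  # impasse -> query LLM
    long_term_memory[situation] = answer           # chunking stores the result
    return answer

first = resolve("genre of 'Dune'")
second = resolve("genre of 'Dune'")  # served from the chunk, not the LLM
```

Soar's actual chunking generalizes over the dependency trace of the resolution rather than memoizing literal situations, but the payoff is the same: reduced reliance on the LLM over time.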

Within the Soar cognitive architecture, Working Memory (WM) serves as the central processing space where current goals, features of the environment, and retrieved knowledge converge. WM’s limited capacity necessitates efficient management of information, prioritizing relevance to current tasks. Long-Term Memory (LTM) provides storage for production rules – the ‘if-then’ statements that drive Soar’s behavior – and episodic memories representing past experiences. Retrieval from LTM into WM is content-addressed, meaning rules and memories are activated based on their similarity to the current WM state. This dynamic interplay between WM and LTM allows CogRec to rapidly access and apply previously learned knowledge, adapt to new situations, and improve performance over time through the creation and strengthening of production rules via chunking.
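Content-addressed retrieval can be approximated with a toy matcher that activates LTM rules whose conditions overlap the current working-memory state. The set encoding and overlap threshold are simplifying assumptions, far cruder than Soar's actual matcher.

```python
# Toy content-addressed retrieval: rules in long-term memory are activated
# by how many of their conditions are present in working memory.

def retrieve(ltm: list, wm: set, threshold: int = 2) -> list:
    """Return actions of rules whose condition overlap meets the threshold."""
    activated = []
    for conditions, action in ltm:
        if len(conditions & wm) >= threshold:
            activated.append(action)
    return activated

ltm = [
    ({"goal:recommend", "genre:sci-fi"}, "propose sci-fi titles"),
    ({"goal:recommend", "genre:romance"}, "propose romance titles"),
]
wm = {"goal:recommend", "genre:sci-fi", "user:alice"}
actions = retrieve(ltm, wm)  # ["propose sci-fi titles"]
```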

CogRec consistently outperforms its variants across all evaluated metrics, indicating that each component contributes to overall performance.

Validating CogRec: Performance and Analytical Results

CogRec’s performance was assessed using three widely adopted datasets representing different recommendation challenges: MovieLens-1M, consisting of one million movie ratings; Amazon Review (Movies), a larger dataset containing movie reviews and ratings; and Yelp, which provides business reviews and ratings. These datasets vary in size, sparsity, and the nature of user-item interactions, allowing for a comprehensive evaluation of CogRec’s adaptability. Specifically, MovieLens-1M focuses on explicit feedback in a relatively dense interaction space, while Amazon Review (Movies) and Yelp present sparser data and incorporate implicit feedback through review text, thereby testing CogRec’s ability to handle diverse data characteristics and recommendation scenarios.

Evaluations of CogRec consistently demonstrated performance gains when measured against established baseline models – Bayesian Personalized Ranking with Matrix Factorization (BPR-MF), Self-Attentive Sequential Recommendation (SASRec), and Generative Pre-trained Transformer for Recommendation (GPT4Rec). Specifically, CogRec achieved higher Hit Rate (HR@10) and Normalized Discounted Cumulative Gain (N@10) scores across multiple datasets including MovieLens-1M, Amazon Review (Movies), and Yelp. HR@10, representing the proportion of user sessions where a relevant item appears within the top 10 recommendations, and N@10, a measure of ranking quality that considers the position of relevant items, both indicated a statistically significant improvement in recommendation accuracy compared to the aforementioned baselines. These metrics were calculated using standard information retrieval methodologies to ensure comparability and reproducibility of results.
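The two metrics named above have standard definitions, sketched here for the common single-relevant-item-per-session setting; the toy ranking data is illustrative only.

```python
import math

# Standard HR@k and NDCG@k for sessions with one relevant item.

def hit_rate_at_k(ranked: list, relevant: str, k: int = 10) -> float:
    """HR@k: 1 if the relevant item appears in the top k, else 0."""
    return 1.0 if relevant in ranked[:k] else 0.0

def ndcg_at_k(ranked: list, relevant: str, k: int = 10) -> float:
    """NDCG@k with a single relevant item: DCG = 1/log2(rank+1), IDCG = 1."""
    for i, item in enumerate(ranked[:k]):
        if item == relevant:
            return 1.0 / math.log2(i + 2)
    return 0.0

ranked = ["m7", "m3", "m9", "m1"]
hr = hit_rate_at_k(ranked, "m3")    # 1.0 (relevant item is in the top 10)
ndcg = ndcg_at_k(ranked, "m3")      # 1/log2(3), since it sits at rank 2
```

Averaging these per-session scores over all test sessions yields the reported HR@10 and N@10 figures.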

CogRec incorporates explainability through the Chain of Thought (CoT) technique, which generates a rationale for each recommendation by outlining the user’s preferences and the item’s characteristics that led to the suggestion. This process involves decomposing the recommendation task into a series of intermediate reasoning steps, allowing users to understand why a specific item was recommended. Evaluation demonstrates that providing these CoT rationales increases user trust in the system and facilitates a deeper understanding of the recommendation logic, going beyond simply presenting a list of items. The generated explanations are based on the model’s internal knowledge and the user’s interaction history, offering interpretable insights into the decision-making process.
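The decomposition into intermediate reasoning steps can be illustrated with a toy rationale builder. The step wording, function name, and inputs are invented for this sketch; the paper's CoT rationales are generated by the LLM, not templated.

```python
# Hypothetical sketch of assembling a chain-of-thought rationale from
# intermediate reasoning steps linking preferences, attributes, and the
# final suggestion.

def build_rationale(user_pref: str, item: str, matched_attrs: list) -> str:
    steps = [
        f"Step 1: the user's history shows a preference for {user_pref}.",
        f"Step 2: '{item}' matches on: {', '.join(matched_attrs)}.",
        f"Step 3: therefore '{item}' is recommended.",
    ]
    return "\n".join(steps)

rationale = build_rationale("slow-burn thrillers", "Prisoners",
                            ["tense pacing", "crime plot"])
```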

CogRec consistently outperforms its variants across all evaluated metrics, indicating that each component contributes to overall performance.

The Future of Intelligent Recommendations: Expanding the Horizon

The development of CogRec introduces a novel neuro-symbolic architecture designed to overcome limitations inherent in traditional recommendation systems. By integrating the strengths of neural networks – pattern recognition and adaptability – with symbolic reasoning, CogRec achieves robust performance even when faced with sparse data. This hybrid approach allows the system to generalize from fewer examples, constructing logical connections and inferring preferences beyond what is explicitly observed. Unlike systems reliant on massive datasets, CogRec can effectively learn and provide relevant recommendations in scenarios where user history is limited or unavailable, offering a pathway towards more personalized experiences and broadening the applicability of recommendation technology to new contexts and user bases. This architecture promises a significant leap towards truly intelligent systems capable of reasoning and adapting in dynamic environments.

The architecture of CogRec prioritizes not just what is recommended, but why, offering users transparent insights into the reasoning behind each suggestion. This emphasis on explainability moves beyond the “black box” problem common in many recommendation systems, allowing individuals to assess the validity of suggestions and build trust in the platform. By revealing the factors influencing a recommendation – such as specific features of an item or a user’s past preferences – CogRec empowers users to make informed decisions and actively participate in the recommendation process. Consequently, this increased understanding fosters greater engagement, as users are more likely to explore and interact with suggestions they perceive as relevant and justified, ultimately leading to higher satisfaction and a more positive user experience.

Researchers are actively broadening the scope of CogRec beyond its initial success in recommendation systems, concentrating on significantly expanding its underlying knowledge base to encompass a wider array of information and relationships. This expansion isn’t merely about accumulating data; it’s about building a more comprehensive and nuanced understanding of the world, allowing CogRec to reason and generalize more effectively. Current efforts are investigating the potential of this enhanced system in fields like personalized education, where it could tailor learning paths to individual student needs, and healthcare, offering support for diagnosis and treatment plans by synthesizing complex medical information. The ultimate goal is to create a versatile cognitive engine capable of providing insightful and trustworthy assistance across a multitude of domains, moving beyond simple predictions to offer genuine understanding and support.

CogRec’s architecture embodies a holistic approach to recommendation, mirroring the interconnectedness of cognitive systems. The fusion of Large Language Models and Soar isn’t merely a technical integration, but a deliberate attempt to model how knowledge is acquired and utilized in a dynamic environment. This resonates with Andrey Kolmogorov’s assertion: “The most important thing in science is not to be afraid of new ideas, but to test them rigorously.” CogRec’s online learning component, particularly its chunking mechanism, actively tests and refines its understanding of user preferences, embodying a commitment to empirical validation. The system’s ability to provide explanations isn’t an afterthought, but an inherent consequence of its symbolic reasoning core – a structure dictating behavior and fostering transparency within the recommendation process.

Where Do We Go From Here?

The fusion of connectionist and symbolic approaches, as demonstrated by CogRec, offers a compelling, if predictably complex, path toward truly intelligent recommendation systems. The immediate challenge lies not in achieving explainability – the system offers explanations – but in validating their fidelity. Does the articulated reasoning genuinely reflect the generative process, or is it a post-hoc rationalization, a comfortable narrative constructed to appease human scrutiny? The system’s reliance on chunking, while elegantly addressing the LLM’s limited context window, introduces a sensitivity to initial conditions and a potential for brittle generalization. Further work must investigate methods for dynamic chunk creation and evaluation, ensuring robustness against shifting data distributions.

Beyond the technical refinements, a more fundamental question remains. Current evaluation metrics primarily assess predictive accuracy – does the system guess correctly? – but offer little insight into whether it is fostering genuine user understanding or merely exploiting predictable patterns. A shift toward metrics that measure epistemic utility – the degree to which the system enhances a user’s knowledge and agency – is crucial. The pursuit of explainable AI is not merely about transparency; it’s about building systems that empower users, not simply predict their desires.

Ultimately, the success of this neuro-symbolic integration hinges on recognizing that intelligence is not a collection of isolated modules, but an emergent property of well-defined structure. Good architecture is invisible until it breaks, and only then is the true cost of decisions visible.


Original article: https://arxiv.org/pdf/2512.24113.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
