How Machines Think by Analogy

Author: Denis Avetisyan


New research sheds light on the internal processes large language models use to solve analogical problems, revealing how they map relationships between concepts.

Large language models can encode the relational information that analogical reasoning depends on, yet applying those relationships successfully often proves as difficult as encoding them in the first place; structural alignment, quantified by the Mutual Alignment Score, is strongly indicative of whether analogous situations are identified.

A mechanistic interpretability study identifies relational encoding in mid-upper layers and structural alignment as key to successful analogical reasoning in large language models.

While large language models excel at pattern recognition, their capacity for true analogical reasoning, the ability to map relationships between concepts, remains an open question. This is the central focus of ‘The Curious Case of Analogies: Investigating Analogical Reasoning in Large Language Models’, a study revealing that successful analogy completion in LLMs hinges on encoding relational information within mid-upper layers and establishing strong structural alignment between analogous situations. However, the research also demonstrates that LLMs often struggle to generalize these relationships to novel entities, a limitation not typically observed in human cognition. Does this suggest a fundamental difference in how LLMs and humans approach relational thinking, and what architectural changes might bridge this gap?


The Illusion of Understanding

Contemporary language models frequently demonstrate impressive abilities in identifying and replicating patterns within data, often achieving high scores on benchmark tests. However, this proficiency can mask a fundamental limitation: a struggle with tasks demanding deep relational reasoning. These models excel at recognizing correlations – for example, associating “dog” with “bark” – but falter when required to understand the underlying connections between entities and concepts in more complex scenarios. This isn’t a matter of insufficient data; rather, the architecture of these models prioritizes surface-level associations over the ability to construct and manipulate representations of relationships, hindering their capacity for genuine comprehension and limiting their performance when faced with novel situations requiring flexible application of knowledge beyond simple pattern recall.

The limitations of current language models often arise not from a lack of data, but from an inability to effectively model the connections between pieces of information. These models excel at identifying patterns – recognizing that “king” often appears near “queen” or “castle” – but struggle when asked to understand how those entities relate, or to infer new relationships based on existing ones. True understanding, it appears, demands more than simply recognizing co-occurrence; it requires the capacity to represent entities and their interactions in a structured way, allowing for flexible manipulation and inference – a skill akin to building and navigating a complex network of knowledge where the strength and type of connection between concepts are as important as the concepts themselves. This relational reasoning deficit hinders performance on tasks requiring nuanced comprehension, highlighting a critical area for advancement in artificial intelligence.
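A toy contrast makes the distinction concrete: co-occurrence statistics record only that two words appear together, whereas a relational structure records how they are connected. The entities, relation labels, and counts below are invented purely for illustration.

```python
# Pattern-matching view: co-occurrence counts carry no information about the
# type of connection, only its frequency (numbers are made up).
co_occurrence = {("king", "queen"): 1520, ("king", "castle"): 980}

# Relational view: typed, directed edges whose labels carry the meaning,
# so different kinds of connection can be distinguished and traversed.
relations = [
    ("king", "spouse_of", "queen"),
    ("king", "rules", "kingdom"),
    ("teacher", "governs", "classroom"),
]

def related(entity: str, relation: str) -> list[str]:
    """Entities reachable from `entity` via one specific typed relation."""
    return [tail for head, rel, tail in relations if head == entity and rel == relation]

print(related("king", "rules"))   # ['kingdom']
```

The second representation supports the kind of inference described above, for example recognising that 'rules' and 'governs' play parallel roles across different entity pairs, something raw co-occurrence counts cannot express.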

Determining the true intelligence of language models requires a shift in evaluation metrics, moving past assessments of simple accuracy on specific datasets. While a model might perform well on questions it has already ‘seen’ during training, its capacity for genuine understanding hinges on its ability to generalize: to apply learned knowledge to entirely new situations and contexts. Researchers are increasingly focused on testing this generalization ability through carefully constructed scenarios that demand more than just pattern recognition; these tests probe whether the model can adapt its reasoning to unfamiliar circumstances, demonstrating a flexible and robust comprehension of the underlying principles at play. This emphasis on generalization offers a more nuanced and reliable gauge of a model’s cognitive capabilities, revealing whether it truly ‘understands’ or simply mimics understanding.

Successfully navigating complex reasoning tasks demands a foundation of pre-existing knowledge, and current language models frequently falter when this prerequisite is unmet. These models, while adept at identifying patterns in training data, often lack the broad understanding of the world necessary to interpret information and draw logical conclusions. Without sufficient background knowledge, a model may struggle to disambiguate meaning, identify relevant context, or make appropriate inferences – even when presented with seemingly straightforward prompts. Researchers are actively exploring methods to imbue these systems with more comprehensive world knowledge, ranging from incorporating vast knowledge graphs to developing techniques for dynamically retrieving relevant information during the reasoning process, ultimately aiming to move beyond rote memorization toward genuine understanding.

Successful decoding of relational information sharply decreases in incorrect cases, highlighting its importance for accurate answer resolution, while attributive information remains consistent regardless of correctness.

Mapping Relationships, Not Just Features

Structural alignment diverges from traditional analogical reasoning evaluations by prioritizing the identification of relational parallels over surface-level feature matching. This framework posits that the strength of an analogy is determined not by the similarity of the entities involved, but by the similarity of the relationships between those entities. Consequently, structural alignment assesses analogical capacity by mapping the roles and connections within different scenarios, effectively abstracting away from the specific content of those scenarios. This allows for the detection of analogies even when the constituent elements appear dissimilar, focusing instead on the underlying structural isomorphism between the compared situations.

Structural alignment fundamentally operates by establishing correspondences between discrete tokens within and across analogical statements. These tokens, which can be words, phrases, or other definable units of text, are compared to identify relational parallels. The process doesn’t focus on lexical matching but instead on determining if the relationships between tokens are consistent across different examples. For instance, identifying that ‘A is to B as C is to D’ requires pinpointing the corresponding roles of A, B, C, and D, irrespective of their specific content. This token-based correspondence mapping is the foundational step in quantifying structural similarity and assessing analogical reasoning capacity.

The Mutual Alignment Score (MAS) provides a quantifiable metric for evaluating structural alignment by calculating the degree of overlap in relational mappings between two analogical statements. Specifically, MAS assesses the proportion of tokens in one statement that correctly align with corresponding tokens in the other, considering not just the presence of matching tokens but also their relational roles. The calculation involves identifying all possible token pairings and determining the percentage that exhibit valid relational correspondence; a higher MAS value (on a scale from 0 to 1) indicates a stronger degree of structural similarity and, consequently, a greater capacity for analogical reasoning. This scoring system allows for objective comparison of analogical capacity across different problem sets or reasoning agents, moving beyond subjective assessments of relational similarity.
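The article does not reproduce the exact formula, but a minimal sketch of one plausible reading of MAS, counting mutual nearest-neighbour matches over token hidden states, is given below; the function name, the cosine-similarity choice, and the mutual-nearest-neighbour criterion are illustrative assumptions rather than the authors' definition.

```python
import numpy as np

def mutual_alignment_score(source: np.ndarray, target: np.ndarray) -> float:
    """Fraction of token pairs that are mutual nearest neighbours (a proxy for MAS).

    source: (n, d) hidden states for the tokens of the source statement
    target: (m, d) hidden states for the tokens of the target statement
    Returns a value in [0, 1]; higher suggests stronger structural alignment.
    """
    # Cosine similarity between every source token and every target token.
    s = source / np.linalg.norm(source, axis=1, keepdims=True)
    t = target / np.linalg.norm(target, axis=1, keepdims=True)
    sim = s @ t.T                        # shape (n, m)

    best_t_for_s = sim.argmax(axis=1)    # each source token's closest target token
    best_s_for_t = sim.argmax(axis=0)    # each target token's closest source token

    # Count a correspondence only when the preference is mutual.
    mutual = sum(1 for i, j in enumerate(best_t_for_s) if best_s_for_t[j] == i)
    return mutual / min(source.shape[0], target.shape[0])
```

Under this reading, two statements whose tokens pick each other out as closest matches in representation space score near 1, while unrelated statements score near 0.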

Traditional assessments of reasoning often focus solely on the correctness of a final answer, failing to differentiate between solutions arrived at through insightful relational mapping versus those achieved through rote memorization or superficial pattern matching. Structural alignment, however, allows for evaluation of how a reasoning process unfolds, specifically examining the quality of identified relationships between elements. By quantifying the correspondence between tokens in different contexts, it moves beyond a binary “correct/incorrect” judgment to provide a graded metric reflecting the depth and accuracy of relational understanding. This nuanced approach is particularly valuable in identifying cognitive strengths and weaknesses, and can reveal analogical capabilities even when a final answer is inaccurate, offering a more complete picture of reasoning proficiency.

The relative mutual alignment score decreases consistently across layers, indicating diminishing correspondence between source and target activations compared to source and distractor activations.

Peeking Inside the Black Box

Hidden Representations within neural networks constitute the internal, learned features that encode information from input data. These representations are not directly interpretable as human-understandable concepts, but are the basis for the network’s decision-making process. Analyzing these representations is crucial for understanding how a neural network arrives at a particular output, rather than simply observing what the output is. The complexity of these representations typically increases with network depth, with each layer building upon the previous to create increasingly abstract and nuanced features. Effective methods for decoding these Hidden Representations are therefore essential for improving model transparency, debugging errors, and ultimately enhancing overall performance.

Neural network hidden representations are constructed using two primary information types: Attributive Information and Relational Information. Attributive Information describes the inherent characteristics of individual entities within the data, such as color, size, or category. Relational Information, conversely, defines the connections and interactions between those entities, specifying how they relate to one another – for example, spatial relationships, hierarchical structures, or dependencies. The combined utilization of these two information types allows the network to build a comprehensive understanding of the data, moving beyond simple feature recognition to encompass contextual awareness and complex reasoning capabilities. Analysis indicates that both types of information are crucial for accurate representation and that improvements in processing either one can significantly impact overall model performance.

Patchscopes is a recently introduced technique for analyzing the hidden representations within neural networks by employing generative modeling to interpret the model’s internal state. Unlike traditional probing methods that rely on static analysis or limited interventions, Patchscopes actively manipulates these representations and observes the resulting changes in model behavior. This is achieved by generating modified representations, or “patches”, and applying them to the model’s internal activations. By analyzing how these patches affect the model’s output, researchers can infer the meaning and function of specific hidden units and identify potential areas for improvement. The generative nature of Patchscopes allows for a more comprehensive and nuanced understanding of the model’s internal workings than previously possible.
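To make the idea concrete, below is a minimal sketch of the activation-patching pattern that Patchscopes builds on, written against a small Hugging Face GPT-2 purely for illustration; the model choice, layer index, source sentence, and identity-style inspection prompt are assumptions for this example, not the paper's actual setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"   # illustrative small model, not necessarily the one studied
LAYER = 8        # illustrative mid-layer index (0-based transformer block)

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

# 1) Run a source prompt and cache the final token's hidden state at LAYER.
#    hidden_states[0] is the embedding output, so block LAYER's output is index LAYER + 1.
src_ids = tok("The queen rules the kingdom", return_tensors="pt").input_ids
with torch.no_grad():
    src_hidden = model(src_ids, output_hidden_states=True).hidden_states[LAYER + 1][0, -1]

# 2) Patch that vector into a generic identity-style prompt and let the model
#    generate; the continuation verbalises what the cached representation encodes.
tgt_ids = tok("cat -> cat; sun -> sun; x", return_tensors="pt").input_ids
PATCH_POS = tgt_ids.shape[1] - 1   # overwrite the placeholder position

def patch_hook(module, inputs, output):
    hidden = output[0] if isinstance(output, tuple) else output
    if hidden.shape[1] > PATCH_POS:   # only on the full-prompt pass, not cached decoding steps
        hidden[0, PATCH_POS] = src_hidden
    return output

handle = model.transformer.h[LAYER].register_forward_hook(patch_hook)
with torch.no_grad():
    out = model.generate(tgt_ids, max_new_tokens=8, pad_token_id=tok.eos_token_id)
handle.remove()

print(tok.decode(out[0, tgt_ids.shape[1]:]))   # what the model 'reads out' of the patched state
```

The entity-level interventions described next plausibly rely on the same mechanics, differing mainly in where the patched vector is read from and written to.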

Analysis of how Patchscopes uses both Attributive and Relational Information provides a means of interpreting the internal reasoning processes of neural networks. Empirical results demonstrate significant performance improvements achievable through targeted interventions based on these internal representations; correcting initially incorrect entity pairings yields gains of up to 38.4%. Furthermore, patching representations between entities, adjusting how the model understands their connections, results in an additional performance increase of up to 38.1%. These gains suggest that refining both entity characteristics and inter-entity relationships within the model’s hidden representations is critical for optimizing performance.

The heatmap visualizes the layers within the resolution token where Patchscopes successfully decodes e_4.

The Illusion of Intelligence and How We’re Trying to Break It

Proportional analogies, such as “man is to king as woman is to queen,” serve as a rigorous benchmark for evaluating a model’s capacity for semantic understanding and relational reasoning. These aren’t simply tests of memorization; they demand that a model discern the underlying relationship between concepts – in this case, gender to royalty – and then accurately apply that same relationship to a novel pair. A model successfully navigating proportional analogies demonstrates an ability to extract meaning beyond surface-level features, identifying that the relationship isn’t about men and kings themselves, but the shared role of gender in defining a corresponding royal title. This capability is crucial for genuine artificial intelligence, moving beyond pattern matching towards a more flexible and human-like capacity for reasoning across diverse contexts and, ultimately, enabling models to generalize knowledge to unseen scenarios.
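One simple, if reductive, way to pose such an analogy to a causal language model is to compare the log-probabilities it assigns to candidate completions, as in the hedged sketch below; the model, the prompt phrasing, and the candidate list are assumptions made for this example and are not the benchmark protocol used in the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")             # illustrative model choice
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def completion_logprob(prompt: str, completion: str) -> float:
    """Sum of log-probabilities the model assigns to `completion` given `prompt`."""
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(prompt + completion, return_tensors="pt").input_ids
    with torch.no_grad():
        logprobs = model(full_ids).logits.log_softmax(-1)
    # The logits at position i predict the token at position i + 1.
    return sum(
        logprobs[0, pos - 1, full_ids[0, pos]].item()
        for pos in range(prompt_len, full_ids.shape[1])
    )

prompt = "man is to king as woman is to"
for candidate in [" queen", " princess", " king"]:      # correct answer vs. distractors
    print(candidate, round(completion_logprob(prompt, candidate), 2))
```

A model that has captured the underlying gender-to-royalty relation should rank the completion 'queen' above the lexically related distractors; ranking by surface association alone would not guarantee that ordering.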

Story analogies present a unique challenge to artificial intelligence, demanding more than simple pattern matching. These assessments evaluate a model’s capacity to identify relational similarities between narratives that differ significantly in their specific details – for instance, recognizing the parallel between a king losing his power and a company facing bankruptcy, despite the vastly different contexts. Successfully navigating such analogies requires a system to abstract away from surface-level features and focus on the underlying structural relationships, effectively demonstrating generalization capabilities beyond memorized examples. This approach moves beyond assessing whether a model knows something to determining if it can apply that knowledge flexibly to novel situations, mirroring a crucial aspect of human intelligence.

Analogical reasoning represents a cornerstone of human cognition, enabling the transfer of knowledge from well-understood situations to novel ones by identifying shared relational structures. This process doesn’t depend on superficial similarities, but rather on recognizing parallels in how things relate to each other – a concept akin to understanding that a king rules a kingdom in a way that a teacher governs a classroom. The brain leverages this ability to make predictions, solve problems, and learn efficiently, and it’s increasingly recognized as crucial for achieving general intelligence in artificial systems. Benchmarking artificial intelligence with tasks requiring analogical reasoning, therefore, moves beyond simply testing memorization and assesses a model’s capacity for flexible thought – its ability to see beyond the specific details of a problem and grasp underlying principles applicable to a wide range of contexts.

Current artificial intelligence models often excel at tasks requiring memorization, yet struggle with genuine reasoning that demands flexible application of knowledge to novel situations. Recent research addresses this limitation through analogical benchmarks – specifically, proportional and story analogies – designed to assess a model’s capacity for extracting semantic relationships and generalizing beyond surface-level patterns. Notably, investigations reveal a strong ability within these models to differentiate between true analogical relationships and mere lexical similarity; linear probe accuracy reaches 82.9% when examining the middle layers (layers 20-30) of the network. This suggests that the foundational capacity for analogical reasoning isn’t simply memorized, but rather is actively represented and processed within the model’s internal structure, offering a promising avenue for developing more robust and adaptable artificial intelligence.
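For readers who want to see what such a probe looks like mechanically, the sketch below trains a linear classifier on cached hidden states; the data here are random stand-ins (a real run would use activations from layers 20-30 and labels separating true analogies from lexically similar distractors), so the printed accuracy will sit near chance rather than the reported 82.9%.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-in data: one hidden-state vector per example plus a binary label
# (1 = true analogical pair, 0 = merely lexically similar pair).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4096))    # replace with cached mid-layer activations
y = rng.integers(0, 2, size=1000)    # replace with real analogy / distractor labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# A linear probe is just a logistic-regression classifier on frozen activations:
# if it separates the classes, the information is linearly decodable at that layer.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.3f}")
```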

Analysis of inter-story layer similarity reveals that analogous token pairs, despite differing surface forms, exhibit strong mutual correspondence as indicated by high similarity scores and highlighted matching layers.

The pursuit of elegant solutions in large language models feels… familiar. This research into analogical reasoning, with its focus on structural alignment and relational representation, simply confirms what postmortems reveal daily. Models, like people, stumble not from lacking information, but from failing to map it correctly. The attention knockout experiments, highlighting where reasoning breaks down, are merely diagnostics of a system failing to connect the dots. It’s a sophisticated echo of debugging legacy code. As Andrey Kolmogorov once stated, “The most important thing in science is not knowing many scientific facts, but knowing how to think.” The model can process information, but applying those relationships is where the real fragility lies. They don’t deploy – they let go.

The Road Ahead

The dissection of analogical reasoning within these large language models reveals, predictably, that success hinges on encoding relationships. The surprise isn’t that relationships matter; it’s that anyone thought the models would spontaneously conjure understanding without them. The identified reliance on mid-upper layers for relational representation simply formalizes the observation that superficial pattern matching eventually fails. The ‘mutual alignment score’ offers a measurable metric, yet feels less like a breakthrough and more like a post-hoc justification for what competent systems already do.

Future work will undoubtedly refine these metrics and explore different architectural nudges. Expect a flurry of papers proposing ever-more-complex attention mechanisms, all promising to solve the ‘relational reasoning’ problem. However, the core issue remains: these models are still approximating intelligence, not embodying it. Each architectural innovation will inevitably introduce new failure modes, new edge cases where structural alignment collapses.

The field chases increasingly subtle forms of ‘generalization,’ while largely ignoring the fundamental problem of brittleness. The current trajectory suggests a future filled with increasingly sophisticated crutches, not actual understanding. It is not necessarily that more layers or parameters are the answer; it may be that the very premise of scaling these systems indefinitely is fundamentally flawed. The focus shouldn’t be on building better analogies, but on acknowledging the inherent limitations of symbolic manipulation.


Original article: https://arxiv.org/pdf/2511.20344.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
