Author: Denis Avetisyan
A new framework prioritizes learning progress and efficient resource use, enabling AI agents to pursue knowledge and adapt within realistic constraints.
This review details an approach to agency-gain through constrained optimization, predictive compression, and improved interface quality for resource-bounded agents.
Despite advances in artificial intelligence, creating truly adaptive and useful systems requires moving beyond purely predictive models to agents operating under realistic constraints. This need motivates the ‘Artificial Agency Program: Curiosity, compression, and communication in agents’, which proposes a framework for building resource-bounded agents driven by learning progress and the efficient allocation of three resources: observation, action, and computation. The core of this program unifies concepts from information theory, thermodynamics, and bounded rationality to prioritize agency-gain through predictive compression, empowerment, and streamlined communication as selective information bottlenecks. Will this approach unlock a new generation of AI systems seamlessly integrated into human workflows, augmenting our capabilities rather than simply automating tasks?
The Tyranny of Scale: Biological Constraints on Intelligence
Contemporary artificial intelligence development is frequently characterized by a relentless pursuit of scale – larger datasets, more parameters, and increased computational power – often at the expense of efficiency. This approach contrasts sharply with biological intelligence, which evolved under intense constraints of energy consumption, physical space, and limited sensing/actuation capabilities. Biological brains, for example, operate on the order of 20 watts, while large AI models can require megawatts. Consequently, current AI systems often exhibit brittle behavior and struggle with generalization, lacking the robustness and adaptability inherent in natural systems. The prioritization of scale overlooks the crucial role that these fundamental limitations played in shaping the very architecture and functionality of intelligence, leading to designs that, while powerful in narrow domains, remain far from the efficient and versatile cognition observed in living organisms.
The architecture of biological intelligence isn’t simply a product of evolutionary optimization, but a direct response to inescapable physical constraints. Organisms face rigid limitations on energy consumption – brains are remarkably efficient, operating on roughly 20 watts – and on computational power, dictated by the speed of neuronal signaling and the density of neural connections. Furthermore, sensing and actuation – how an organism perceives the world and interacts with it – are bound by the laws of physics governing signal transduction and biomechanics. These constraints aren’t merely obstacles, but formative forces; they’ve sculpted the brain’s modularity, sparsity, and reliance on predictive processing. Consequently, biological systems prioritize efficient coding and representation over brute-force computation, a principle often overlooked in current artificial intelligence development, where scale frequently overshadows energetic and physical feasibility.
Contemporary artificial intelligence development frequently pursues increasing model size and computational power as primary goals, often overlooking the inherent limitations present in biological systems. This approach results in AI systems demonstrably susceptible to failure outside of narrowly defined parameters – a brittleness stemming from a disregard for fundamental constraints on energy consumption, computational efficiency, and sensory input. The concept of the “Constraint Manifold” encapsulates these limitations – a multi-dimensional space defining the boundaries of feasible intelligence – and AI models operating far from this manifold exhibit a lack of robustness and generalization ability. Essentially, these systems, while capable of impressive feats within their training domain, struggle with novelty and unexpected scenarios because they haven’t evolved – or been engineered – to operate under the same resource pressures that have shaped biological intelligence for billions of years.
The development of truly robust and general artificial intelligence hinges on a careful consideration of biological constraints, and a new framework seeks to measure how closely current AI systems adhere to these limitations through a metric termed “Constraint Proximity”. This proximity isn’t simply about matching biological performance, but rather quantifying the degree to which an AI’s architecture and operation align with the fundamental limits imposed by energy consumption, computational resources, and sensory input – the very factors that have shaped biological intelligence. A high Constraint Proximity suggests an AI is operating efficiently within realistic bounds, potentially leading to greater resilience, adaptability, and generalization capabilities, while a low proximity may indicate a brittle system overly reliant on scale and susceptible to unforeseen challenges. By assessing this proximity, researchers can begin to identify critical areas for innovation, moving beyond simply increasing computational power towards designs that mirror the elegant efficiency of natural intelligence.
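The paper is not quoted here with an explicit formula for Constraint Proximity, so the following is only an illustrative sketch of the idea: a score in (0, 1] that is maximal when every resource stays within a biologically plausible budget and shrinks as budgets are exceeded. The function name, the budgets, and the formula are all assumptions for illustration, not the authors’ definition.

```python
import math

def constraint_proximity(usage_vs_bound):
    """Illustrative proximity score in (0, 1].

    usage_vs_bound: list of (used, bound) pairs, one per resource
    (e.g., energy in watts, compute in FLOP/s). Resources at or
    under budget contribute nothing; exceeding a budget adds a
    squared relative-excess penalty, and the score decays with the
    L2 norm of those excesses.
    """
    excess = 0.0
    for used, bound in usage_vs_bound:
        excess += max(0.0, used / bound - 1.0) ** 2
    return 1.0 / (1.0 + math.sqrt(excess))

# A brain-like agent within all budgets scores 1.0; a megawatt-scale
# model measured against a 20 W energy budget scores near zero.
within = constraint_proximity([(15, 20), (0.8, 1.0)])
over = constraint_proximity([(2e6, 20), (0.8, 1.0)])
```

The monotone decay is the only property that matters for the argument in the text: distance from the constraint manifold, not raw capability, is what the metric penalizes.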
Intrinsic Motivation: The Pursuit of Predictive Competence
Intrinsic motivation, a fundamental aspect of biological intelligence, diverges from goal-oriented behavior by prioritizing the enhancement of an agent’s ability to accurately predict future outcomes. This drive isn’t focused on reaching specific, pre-defined objectives, but rather on refining the internal model used to anticipate environmental changes and events. The value derived from improved predictive capacity serves as the primary reward signal, influencing exploratory behavior and adaptation. Consequently, agents are motivated to seek information that reduces uncertainty and strengthens their understanding of the world, even in the absence of external rewards or immediate benefits. This predictive processing framework suggests that intelligence is fundamentally about building and refining an internal model of reality, rather than simply reacting to stimuli.
The Learning Progress principle posits that an agent’s intrinsic reward is directly correlated with its ability to detect and compress patterns within its environment, thereby improving its predictive capabilities. This reward mechanism prioritizes the identification of predictable regularities over the pursuit of novelty or unexpected stimuli; an agent is thus driven to seek information that refines its internal model and reduces prediction error. Essentially, the degree of “learning progress” – the quantifiable improvement in the agent’s ability to anticipate future states – serves as the basis for an intrinsic reward signal, guiding exploration towards areas that maximize model accuracy and minimize uncertainty. This contrasts with reward systems based solely on external stimuli or random exploration, offering a more efficient and directed approach to environmental mastery.
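One common concrete reading of this principle measures learning progress as the recent decrease in average prediction error. The class below is a minimal sketch under that assumption (names and window mechanics are mine, not the program’s actual mechanism); note how a flat error curve, such as unlearnable noise, yields no reward.

```python
from collections import deque

class LearningProgressReward:
    """Intrinsic reward = recent drop in mean prediction error.

    Keeps a sliding buffer of the last 2*window errors and rewards
    the agent by how much the newer half improves on the older half.
    Unpredictable noise keeps the error flat, so it earns nothing,
    steering exploration toward learnable structure.
    """

    def __init__(self, window=5):
        self.window = window
        self.errors = deque(maxlen=2 * window)

    def reward(self, prediction_error):
        self.errors.append(prediction_error)
        if len(self.errors) < 2 * self.window:
            return 0.0  # not enough history to estimate progress
        old = sum(list(self.errors)[:self.window]) / self.window
        new = sum(list(self.errors)[self.window:]) / self.window
        return max(0.0, old - new)  # positive only while improving

# Falling errors (the model is learning) produce reward.
lp = LearningProgressReward(window=2)
rewards = [lp.reward(e) for e in [1.0, 0.9, 0.5, 0.4, 0.3]]
```

Clipping at zero is a design choice: forgetting (rising error) is treated as zero progress rather than punishment, which keeps the signal a pure exploration bonus.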
The Learning-Progress Intrinsic Reward functions as a motivational signal driving an agent to actively seek information that refines its internal representation of the environment, termed the Model of Embedded Agent. This reward is not tied to achieving specific goals, but rather to the reduction of uncertainty and improvement of predictive accuracy within the model. Consequently, the agent is incentivized to explore states and actions that lead to the most significant gains in its ability to anticipate future outcomes, effectively building a more robust and reliable understanding of its surroundings. This process of iterative model refinement is central to adapting to changing conditions and optimizing performance within the environment.
Plasticity, facilitated by learning progress as an intrinsic reward, allows agents to modify their internal models in response to environmental changes. This continuous adaptation is crucial for thriving in dynamic environments where pre-programmed responses are insufficient. The ability to refine predictive capabilities – through ongoing model adjustments based on experienced discrepancies between prediction and reality – enables agents to maintain performance across varying conditions. This contrasts with rigid systems, and supports robust behavior in unpredictable scenarios where novelty and unexpected events are commonplace. The degree of plasticity determines an agent’s capacity to generalize learned behaviors and successfully navigate unfamiliar situations, effectively ensuring long-term viability.
Empowerment and Efficiency: The Limits of Action
Intelligence, as it pertains to agency, is fundamentally defined by an agent’s ability to affect its environment and achieve desired outcomes – a metric referred to as [latex]Empowerment[/latex]. This concept shifts the focus from raw computational capacity to the efficacy of action; an agent can possess significant processing power without demonstrating intelligence if it cannot translate that power into meaningful changes within its surroundings. [latex]Empowerment[/latex] is therefore not simply a function of internal cognitive abilities, but a measure of the agent’s influence over external states, quantified by the expected change in environmental observability resulting from the agent’s actions. This framing emphasizes that true intelligence is demonstrated through effective interaction and consequential manipulation of the environment, rather than solely through complex internal processing.
Maximizing an agent’s [latex]Channel\,Capacity[/latex] – defined as the rate of successful information transfer between the agent and its environment – is critical for effective operation under resource constraints. This capacity is not absolute; it is fundamentally limited by both available energy and computational resources. Increasing [latex]Channel\,Capacity[/latex] necessitates optimizing the efficiency of information transmission and processing. Agents must therefore prioritize the transfer of salient environmental data while minimizing redundant or irrelevant information to operate within defined energy and compute budgets. Failure to do so results in information bottlenecks, reduced responsiveness, and ultimately, diminished performance in complex environments.
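In the information-theoretic literature, empowerment is standardly formalized as exactly this channel capacity: the maximum mutual information an agent’s actions can inject into its future sensor states. Assuming a known discrete transition model (a simplification relative to the paper’s setting), it can be computed with the classic Blahut–Arimoto algorithm:

```python
import math

def empowerment(p_s_given_a, iters=200):
    """Channel capacity max over p(a) of I(A; S'), in nats, via
    Blahut-Arimoto. p_s_given_a[a][s] = P(next state s | action a)."""
    n_a = len(p_s_given_a)
    n_s = len(p_s_given_a[0])
    p_a = [1.0 / n_a] * n_a  # start from a uniform action policy

    def marginal(p_a):
        return [sum(p_a[a] * p_s_given_a[a][s] for a in range(n_a))
                for s in range(n_s)]

    def kl_to_marginal(a, p_s):
        # D_KL( P(S'|A=a) || P(S') ): how distinguishable action a is
        return sum(p * math.log(p / p_s[s])
                   for s, p in enumerate(p_s_given_a[a]) if p > 0)

    for _ in range(iters):
        p_s = marginal(p_a)
        # Reweight each action by exp of its KL "distinctiveness"
        w = [p_a[a] * math.exp(kl_to_marginal(a, p_s)) for a in range(n_a)]
        z = sum(w)
        p_a = [x / z for x in w]

    p_s = marginal(p_a)
    return sum(p_a[a] * kl_to_marginal(a, p_s) for a in range(n_a))

# Two actions with perfectly distinguishable outcomes: capacity = log 2.
cap = empowerment([[1.0, 0.0], [0.0, 1.0]])
```

An action whose outcome distribution duplicates another’s adds nothing to the capacity, which is precisely the sense in which empowerment measures effective influence rather than the raw number of available actions.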
The Information Bottleneck (IB) principle provides a theoretical framework for learning compact representations of input data by maximizing the mutual information between the representation and the target variable, while simultaneously minimizing the mutual information between the representation and the input itself. This process effectively compresses the input data, retaining only information relevant to predicting the desired output, thereby optimizing communication efficiency. Formally, the IB objective seeks to find a representation [latex]Z[/latex] of an input [latex]X[/latex] that maximizes [latex]I(Z;Y)[/latex] (relevance to the target [latex]Y[/latex]) subject to a constraint on [latex]I(X;Z)[/latex] (data compression). By minimizing the dependence on irrelevant input features, the resulting representation requires fewer resources for transmission and processing, enabling efficient information transfer and improved performance in resource-constrained environments.
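Both mutual-information terms of the IB objective are easy to compute for discrete distributions. The helper below is a small sketch that scores a given encoder against the Lagrangian [latex]I(X;Z) - \beta\, I(Z;Y)[/latex]; a full IB solver would additionally iterate over encoders, and the variable names here are mine.

```python
import math

def mutual_information(p_joint):
    """I(X;Y) in nats from a joint distribution given as a nested
    list: p_joint[i][j] = P(X=i, Y=j)."""
    p_x = [sum(row) for row in p_joint]
    p_y = [sum(col) for col in zip(*p_joint)]
    return sum(p * math.log(p / (p_x[i] * p_y[j]))
               for i, row in enumerate(p_joint)
               for j, p in enumerate(row) if p > 0)

def ib_objective(p_xz, p_zy, beta):
    """IB Lagrangian I(X;Z) - beta * I(Z;Y): minimizing it trades
    compression of the input X against relevance to the target Y."""
    return mutual_information(p_xz) - beta * mutual_information(p_zy)

# A copying code Z = X carries log(2) nats about X;
# an uninformative Z (independent of X) carries none.
copy_mi = mutual_information([[0.5, 0.0], [0.0, 0.5]])
null_mi = mutual_information([[0.25, 0.25], [0.25, 0.25]])
```

The trade-off parameter [latex]\beta[/latex] sets the exchange rate between the two terms: large [latex]\beta[/latex] favors keeping task-relevant bits, small [latex]\beta[/latex] favors aggressive compression.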
The application of Minimum Description Length (MDL) principles facilitates effective agent operation under resource constraints by prioritizing compact representations of information. This approach, focused on balancing model complexity and accuracy, directly contributes to [latex]Capability\,Expansion[/latex] – the agent’s ability to successfully address a wider range of tasks given limited energy and compute. Performance improvements resulting from this methodology are quantitatively assessed by calculating the L2 distance between the agent’s achieved performance and a defined performance frontier, where minimal distance indicates optimal efficiency and maximized capability within the given resource limitations.
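In the simplest reading of that evaluation metric (a sketch of mine, assuming the frontier is represented by a set of sampled points rather than a closed-form surface), the L2 distance is:

```python
import math

def frontier_distance(achieved, frontier_points):
    """Euclidean (L2) distance from an agent's performance point
    (e.g., accuracy achieved per unit energy and compute) to the
    nearest sampled point on the performance frontier. A distance
    of 0 means the agent sits on the frontier: maximal capability
    for its resource budget."""
    return min(math.dist(achieved, p) for p in frontier_points)

# An agent one unit away from its closest frontier point.
d = frontier_distance((0.0, 0.0), [(3.0, 4.0), (1.0, 0.0)])
```

Because the axes mix units (energy, compute, task score), any practical use would normalize each dimension before taking the distance; that normalization is omitted here for brevity.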
Beyond Autonomy: The Rise of Coupled Human-AI Systems
The trajectory of artificial intelligence is increasingly focused on systems designed to work with people, rather than simply replacing them. This shift prioritizes the development of Coupled Human-AI Systems, where the strengths of both entities are combined to achieve outcomes beyond the reach of either alone. Rather than striving for fully autonomous intelligence, the emphasis lies on augmentation – building AI tools that amplify human cognitive and physical capabilities. This collaborative approach acknowledges that many real-world problems require not only complex computation, but also uniquely human skills like nuanced judgment, ethical reasoning, and adaptability to unforeseen circumstances. Consequently, successful AI will likely manifest not as independent agents, but as intelligent partners seamlessly integrated into human workflows, enhancing productivity, creativity, and overall problem-solving capacity.
Successfully integrating artificial intelligence into human workflows demands a critical focus on minimizing interface friction – the inherent difficulties in ensuring AI actions consistently align with intended human goals within the bounds of real-world limitations. This friction isn’t simply a matter of technical bugs; it encompasses mismatches in understanding, delays in response, and the cognitive load imposed when humans must constantly monitor or correct AI behavior. High interface friction can negate the benefits of AI, leading to frustration, decreased efficiency, and ultimately, rejection of the technology. Addressing this requires careful consideration of how AI communicates its reasoning, anticipates human needs, and adapts to dynamic environments, fostering a collaborative dynamic rather than a corrective one. The challenge lies in building systems that not only can perform tasks, but do so in a way that feels intuitive, predictable, and supportive of human agency.
The pursuit of truly effective artificial intelligence isn’t about creating standalone entities, but rather systems where human capability is amplified through collaboration. This amplification, termed Agency Gain, hinges on a delicate balance: simultaneously empowering the AI to act with increasing independence – its “agency” – while relentlessly reducing the challenges of aligning those actions with human intentions and the constraints of the real world. Successfully achieving this balance fosters a synergistic relationship where the AI doesn’t simply perform tasks, but actively extends human capacity, effectively creating a combined intelligence greater than the sum of its parts. This isn’t merely about improving efficiency; it’s about unlocking new possibilities through seamless, intuitive collaboration, allowing humans and AI to tackle complex problems with unprecedented effectiveness.
Evaluating the true potential of coupled human-AI systems demands robust benchmarking beyond isolated task completion. Benchmarks like ARC-AGI – the Abstraction and Reasoning Corpus for Artificial General Intelligence – provide a means to assess a system’s ability to generalize skills and acquire new ones within complex, realistic environments. This work introduces a framework designed to maximize a “Unification Score”, a metric that functions as a proxy for interface quality by quantifying how seamlessly AI actions align with, and amplify, human intent. A higher Unification Score suggests a reduced cognitive load for the human operator and a more effective synergistic partnership, ultimately indicating a more successful coupled system capable of tackling increasingly intricate challenges.
Beyond Scaling: Towards Embodied and Adaptive Intelligence
Current artificial intelligence often relies on increasing computational power and data to refine existing algorithms, a strategy approaching its limits. The pursuit of general intelligence, however, necessitates a departure from this scaling paradigm. Researchers are beginning to explore fundamentally new architectures inspired by the brain’s organization, focusing on principles like sparse coding, predictive processing, and hierarchical inference. These approaches prioritize efficiency and adaptability over sheer computational brute force, aiming to create systems that can learn and reason with limited resources. This architectural shift isn’t merely about building bigger models; it’s about building smarter ones, capable of abstract thought, common-sense reasoning, and genuine understanding – qualities that remain elusive despite recent advances in deep learning.
The pursuit of truly intelligent machines requires a departure from purely computational approaches and a move towards embodied intelligence. Current AI often operates in simulated environments, lacking the crucial feedback loop provided by physical interaction with a complex, unpredictable world. This embodiment isn’t merely about adding robotic bodies; it’s about grounding AI within the constraints of reality – limitations in energy, materials, and action – which forces the development of efficient and resourceful problem-solving strategies. Furthermore, progress hinges on instilling intrinsic motivation – an internal drive for curiosity, exploration, and learning – rather than relying solely on externally defined rewards. Such systems, driven by inherent curiosity and shaped by real-world consequences, are more likely to exhibit the adaptability, resilience, and genuine understanding characteristic of natural intelligence.
The emergence of true intelligence, it is increasingly understood, isn’t solely a matter of computational power, but a product of dynamic interaction with a physical environment. Biological Agency Development – the process by which living organisms learn, adapt, and thrive through embodied experience – provides a crucial blueprint for future AI. Rather than passively processing data, intelligent systems must actively act within the world, receiving feedback from their actions and refining their internal models accordingly. This necessitates a shift from purely algorithmic approaches to architectures that prioritize sensorimotor loops, intrinsic motivation, and the ability to predict and control outcomes. By replicating the core principles of how organisms develop agency – the capacity for independent action and goal-directed behavior – researchers aim to create AI that isn’t simply “smart” but genuinely capable, exhibiting adaptability, resilience, and a deeper understanding of the world around it.
The development of truly advanced artificial intelligence necessitates a move beyond mere computational power, ultimately yielding systems characterized by adaptability, resilience, and value alignment. These future AI entities won’t simply process information; they will actively engage with and learn from complex, unpredictable environments, demonstrating a robustness currently absent in most AI designs. This inherent adaptability stems from a grounding in physical reality, allowing these systems to overcome unforeseen challenges and recover from failures – a key characteristic of biological intelligence. Crucially, aligning AI with human values isn’t a matter of programming ethics, but rather emerges from the system’s own drive for continued interaction and successful navigation of a shared world, fostering a natural convergence towards cooperative and beneficial outcomes.
The pursuit of artificial agency, as detailed in this program, fundamentally hinges on minimizing redundancy and maximizing the signal within constrained systems. This echoes Edsger W. Dijkstra’s assertion that “Simplicity is prerequisite for reliability.” The framework prioritizes learning progress and efficient resource utilization – a direct application of this principle. Every superfluous computation or unnecessary observation introduces a potential abstraction leak, hindering the agent’s ability to accurately model its environment and effectively pursue its goals. The emphasis on predictive compression and empowerment isn’t merely about achieving functionality; it’s about crafting solutions demonstrably correct by their inherent simplicity and mathematical purity, ensuring a reliable coupling with human intent.
The Road Ahead
The pursuit of artificial agency, as outlined in this work, reveals a fundamental tension. The elegance of predictive compression, and its connection to empowerment, offers a theoretically sound foundation. However, true agency necessitates more than mere prediction; it demands a demonstrable coupling with the messy, inefficient reality of resource constraints. The framework presented is a step towards addressing this, but the question of interface quality remains stubbornly complex. How does one quantify the alignment between an agent’s internal model and the nuanced intent of a human collaborator, beyond simplistic reward signals?
Future investigations should focus less on achieving ever-higher scores in contrived environments, and more on the development of provably efficient algorithms for learning progress within genuinely limited computational budgets. The current emphasis on scale feels… unsustainable. A harmonious solution will not arise from brute force, but from a deeper understanding of the symmetries inherent in the problem space.
Ultimately, the value of this line of inquiry lies not in creating artificial minds that resemble human intelligence, but in formalizing the principles that govern any adaptive system. The study of agency, therefore, is less about building machines, and more about understanding the very nature of adaptation itself-a problem that has occupied philosophers, and mathematicians, for centuries.
Original article: https://arxiv.org/pdf/2602.24100.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/
2026-03-02 09:41