When AI Plays to Human Biases

Author: Denis Avetisyan


New research reveals how incorporating behavioral economics into multi-agent AI systems alters strategic interactions and challenges traditional game theory assumptions.

The study demonstrates a competitive dynamic wherein a learning heuristic, assessed by [latex]Q[/latex] values, engages in a one-off contest against an artificial intelligence, revealing the relative performance of each approach in a direct, isolated encounter.

This review examines the impact of Prospect Theory on multi-agent reinforcement learning and its effects on equilibrium outcomes in noncooperative game scenarios.

While traditional game theory assumes rational actors, human strategic behavior is often shaped by cognitive biases and emotional responses. This is addressed in ‘Noncooperative Human-AI Agent Dynamics’, which investigates how incorporating Prospect Theory (a descriptive model of human decision-making exhibiting loss aversion and reference dependence) into multi-agent systems impacts emergent dynamics when competing against standard expected-utility-maximizing agents. Simulations across various game scenarios reveal that deviations from purely rational behavior can significantly alter strategic interactions and equilibrium outcomes, sometimes corroborating established behavioral anomalies. How might a more nuanced understanding of human preferences inform the design of more robust and predictable human-AI collaborations in complex strategic environments?


Beyond the Rational Actor: Unmasking the Limits of Utility

At the core of classical game theory lies Expected Utility Theory, a framework positing that individuals consistently strive to maximize their perceived value when making decisions. This model operates on the principle that choices aren’t simply about maximizing gains, but rather about weighing potential outcomes against their associated probabilities and the subjective value – or ‘utility’ – assigned to each. Essentially, a rational agent calculates the expected utility of each option – a sum of the utility of each possible outcome multiplied by its probability – and selects the option with the highest calculated value. For decades, this assumption of value maximization served as a foundational building block for predicting behavior in strategic interactions, from economic negotiations to evolutionary biology. However, subsequent research has revealed that human decision-making frequently diverges from this idealized model of rationality, prompting a re-evaluation of its universal applicability and inspiring the development of behavioral game theory.
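The expected utility calculation described above can be sketched in a few lines. This is a minimal illustration; the lotteries and the square-root utility function are invented for the example, not taken from the article.

```python
# Minimal sketch of expected utility maximization.
# The lottery values and probabilities below are illustrative assumptions.

def expected_utility(outcomes, utility=lambda x: x):
    """Sum of utility(outcome) weighted by its probability."""
    return sum(p * utility(x) for x, p in outcomes)

# Two hypothetical lotteries: a sure $40 vs. a 50/50 gamble on $100 or $0.
sure_thing = [(40.0, 1.0)]
gamble = [(100.0, 0.5), (0.0, 0.5)]

# A risk-neutral agent (linear utility) prefers the gamble (EU of 50 vs. 40).
assert expected_utility(gamble) > expected_utility(sure_thing)

# A risk-averse agent (concave utility, e.g. sqrt) prefers the sure thing.
risk_averse = lambda x: x ** 0.5
assert expected_utility(sure_thing, risk_averse) > expected_utility(gamble, risk_averse)
```

The choice of utility function is doing all the work here: the same probabilities and payoffs yield opposite preferences depending on its curvature, which is exactly the degree of freedom that behavioral findings later showed is insufficient on its own.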

Behavioral economics has revealed that human decisions consistently diverge from the predictions of perfect rationality, challenging the foundational assumptions of many economic models. Studies demonstrate that individuals are susceptible to cognitive biases – systematic patterns of deviation from normatively rational judgment – such as loss aversion, where the pain of a loss is felt more acutely than the pleasure of an equivalent gain. Furthermore, heuristics, or mental shortcuts, are frequently employed to simplify complex decisions, leading to predictable errors in judgment. These deviations aren’t random; rather, they are ingrained patterns observed across populations, suggesting that humans aren’t simply flawed maximizers of utility, but operate with a bounded rationality, constrained by cognitive limitations and emotional influences. Consequently, traditional models reliant on purely rational actors often fail to accurately predict real-world behavior, prompting the development of more psychologically realistic frameworks.

The persistent gap between theoretical predictions of rational behavior and observed human choices fundamentally weakens the explanatory power of standard game-theoretic models. While these models assume individuals consistently maximize expected utility, behavioral studies reveal systematic biases – loss aversion, framing effects, and hyperbolic discounting, among others – that lead to predictable deviations from this ideal. This isn’t simply random error; rather, these patterns demonstrate that individuals often prioritize factors beyond pure economic gain, such as fairness, social norms, or immediate gratification. Consequently, a more nuanced approach to modeling decision-making is required, one that incorporates these psychological realities and allows for a richer, more accurate understanding of strategic interactions, potentially integrating concepts from bounded rationality and neuroeconomics to refine predictive capabilities.

Framing Reality: The Power of Loss and Relative Value

Prospect Theory diverges from traditional economic models of rational choice by positing that individuals do not assess outcomes based on their absolute values, but rather on gains and losses relative to a specific [latex]ReferencePoint[/latex]. This [latex]ReferencePoint[/latex] typically represents the individual’s current state or an expected outcome; deviations from this point are then evaluated as either gains or losses. The magnitude of these perceived gains and losses, rather than the absolute values, drives decision-making. Consequently, a gain of $100 is not perceived as the mirror image of a loss of $100; instead, the negative impact of the loss is generally weighted more heavily. This relative evaluation fundamentally alters how individuals respond to potential outcomes, leading to predictable deviations from expected utility maximization.

Loss aversion, a central concept within prospect theory, describes the empirically observed asymmetry in how individuals value gains and losses. Research consistently demonstrates that the psychological impact of experiencing a loss of a given amount is significantly greater than the positive emotional impact of gaining the same amount. Specifically, studies suggest that losses are typically felt with approximately twice the intensity of equivalent gains; this is often quantified as a steeper negative value function for losses compared to the positive value function for gains. This asymmetry is not simply a cognitive bias but is deeply rooted in evolutionary pressures, where avoiding threats (losses) historically held greater survival importance than acquiring opportunities (gains). The effect manifests across diverse contexts, influencing decision-making under risk and explaining phenomena such as the endowment effect and status quo bias.
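The asymmetric value function described above can be made concrete. The sketch below uses the commonly cited Tversky-Kahneman parameter estimates ([latex]\alpha = \beta = 0.88[/latex], [latex]\lambda = 2.25[/latex]); these numbers are standard in the Prospect Theory literature but are not taken from the article under review.

```python
# Sketch of the Prospect Theory value function, assuming the standard
# Tversky-Kahneman parameter estimates (alpha = beta = 0.88, lambda = 2.25).

ALPHA, BETA, LAMBDA = 0.88, 0.88, 2.25

def value(x, reference=0.0):
    """Subjective value of an outcome relative to a reference point."""
    gain = x - reference
    if gain >= 0:
        return gain ** ALPHA            # diminishing sensitivity to gains
    return -LAMBDA * (-gain) ** BETA    # losses loom larger than gains

# A $100 loss hurts more than a $100 gain pleases (by a factor of lambda):
assert abs(value(-100)) > value(100)
```

Note that the asymmetry comes entirely from [latex]\lambda[/latex]: with [latex]\lambda > 1[/latex] the value curve is steeper below the reference point than above it, which is the formal statement of the roughly two-to-one intensity of losses described in the paragraph above.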

Cumulative Prospect Theory builds upon the foundational Prospect Theory by addressing scenarios involving decision-making under risk, where outcomes are not certain. While Prospect Theory primarily focuses on outcomes, Cumulative Prospect Theory incorporates a probability weighting function [latex]w(p)[/latex] which modifies the probabilities associated with each outcome. This function demonstrates that individuals tend to overweight small probabilities and underweight large probabilities, leading to deviations from expected utility theory. Specifically, the theory posits that the value function is defined over gains and losses relative to a reference point, and that the overall value is calculated as the sum of the value of each outcome multiplied by its weighted probability. This allows for a more nuanced understanding of risk attitudes, particularly in situations with non-linear probabilities and where the framing of choices significantly impacts decisions.
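The inverse-S shape of [latex]w(p)[/latex] can be sketched with the Tversky-Kahneman functional form. The curvature parameter [latex]\gamma = 0.61[/latex] is a commonly cited estimate for gains, used here purely for illustration; the article itself does not specify these parameters.

```python
# Sketch of the inverse-S probability weighting function from Cumulative
# Prospect Theory (Tversky-Kahneman form), assuming gamma = 0.61.

GAMMA = 0.61

def w(p):
    """Overweights small probabilities, underweights large ones."""
    return p ** GAMMA / (p ** GAMMA + (1 - p) ** GAMMA) ** (1 / GAMMA)

assert w(0.01) > 0.01   # a 1% chance feels larger than it is
assert w(0.9) < 0.9     # a 90% chance feels smaller than it is
```

Replacing raw probabilities with [latex]w(p)[/latex] in the expected-value sum is what produces the familiar behavioral pattern of buying both lottery tickets and insurance: rare gains and rare losses are both overweighted.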

The ActionChangeRate, which quantifies the frequency with which an agent modifies its strategic behavior, demonstrates a direct correlation with principles derived from Prospect Theory. Specifically, observed data from experimental game theory, notably Ochs’ Game, indicates this rate can be up to 50% higher than in other comparable game scenarios. This increased rate of strategy adjustment is attributed to the amplified psychological impact of potential losses relative to equivalent gains, as posited by loss aversion. Agents in Ochs’ Game, facing a structured payoff matrix, exhibit a heightened sensitivity to avoiding losses, leading to more frequent alterations in their actions as they attempt to mitigate negative outcomes and capitalize on potential gains, even when objectively similar scenarios exist with lower ActionChangeRates.
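The article does not give a formula for the ActionChangeRate, but a natural reading of "the frequency with which an agent modifies its strategic behavior" is the fraction of consecutive steps at which the chosen action differs. A minimal sketch under that assumption, with an invented play trace:

```python
# A sketch of how an ActionChangeRate could be computed from a recorded
# action trace; the definition and the trace are illustrative assumptions.

def action_change_rate(actions):
    """Fraction of consecutive steps at which the agent switched actions."""
    if len(actions) < 2:
        return 0.0
    changes = sum(a != b for a, b in zip(actions, actions[1:]))
    return changes / (len(actions) - 1)

trace = ["C", "C", "D", "C", "D", "D"]   # hypothetical play history
assert action_change_rate(trace) == 3 / 5
```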

When Equilibrium Fails: The Pathology of Predictable Instability

Under the framework of Expected Utility Theory, several canonical game theory examples-including the Prisoner’s Dilemma, Matching Pennies, Battle of the Sexes, Stag Hunt, and Chicken-consistently produce Nash Equilibria. These equilibria represent stable states where no player can improve their outcome by unilaterally changing their strategy, assuming all other players maintain theirs. Specifically, the Prisoner’s Dilemma typically yields a dominant strategy equilibrium of mutual defection, Matching Pennies results in a mixed strategy equilibrium with equal probabilities, the Battle of the Sexes has two pure strategy Nash Equilibria, Stag Hunt presents both a risk-dominant and payoff-dominant equilibrium, and Chicken results in a mixed strategy equilibrium with varying probabilities of cooperation and defection. The existence of these predictable equilibria is a foundational aspect of traditional game theory analysis when players are assumed to maximize expected utility.
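The "no profitable unilateral deviation" condition is mechanical to check for pure strategies. The sketch below verifies it for the Prisoner’s Dilemma using conventional textbook payoffs (the article does not specify its payoff matrices):

```python
# Sketch: checking the Nash condition in a standard Prisoner's Dilemma.
# Payoff values are the conventional textbook ones, assumed for illustration.

# payoffs[(row_action, col_action)] = (row player's payoff, column player's payoff)
payoffs = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def is_nash(a1, a2):
    """True if neither player gains by unilaterally deviating."""
    u1, u2 = payoffs[(a1, a2)]
    row_ok = all(payoffs[(alt, a2)][0] <= u1 for alt in "CD")
    col_ok = all(payoffs[(a1, alt)][1] <= u2 for alt in "CD")
    return row_ok and col_ok

assert is_nash("D", "D")        # mutual defection is the equilibrium
assert not is_nash("C", "C")    # mutual cooperation is not stable
```

For games like Matching Pennies the same check fails at every pure profile, which is why their equilibria are mixed; it is exactly these mixed equilibria that Prospect Theory preferences destabilize in the games discussed next.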

Crawford’s Counterexample and Ochs’ Game illustrate scenarios where the application of Prospect Theory to game-theoretic models results in the non-existence of Nash Equilibria. In these games, players exhibiting loss aversion and weighting functions – core tenets of Prospect Theory – consistently deviate from the mixed strategy equilibria predicted by Expected Utility maximization. Specifically, players prioritize avoiding perceived losses over achieving equivalent gains, leading to strategic behaviors that prevent convergence to a stable equilibrium. Empirical results from these games demonstrate that agents do not consistently play the theoretically predicted mixed strategies, and attempts to reach equilibrium through iterated play fail, indicating a fundamental breakdown of standard rational choice assumptions when coupled with Prospect Theory-based preferences.

Equilibrium pathologies represent deviations from predicted outcomes in game theory when players exhibit behaviors inconsistent with the axioms of expected utility maximization. Specifically, these pathologies arise in games like Crawford’s Counterexample and Ochs’ Game where players, when modeled with Prospect Theory rather than expected utility, consistently fail to converge on Nash Equilibria-stable states where no player has an incentive to deviate. This indicates a breakdown of traditional rational behavior, as defined by game-theoretic models, and demonstrates that the existence of a Nash Equilibrium does not guarantee its emergence in practice when behavioral factors are considered. These pathologies highlight the limitations of relying solely on expected utility as a descriptive model of human decision-making in strategic interactions.

In strategic interactions exhibiting equilibrium pathologies, the StateHistory – a record of all prior actions taken by players – fundamentally alters the decision-making process. Unlike traditional game theory which assumes forward-looking rationality based solely on payoffs, players in these scenarios demonstrably weight past actions when formulating current strategies. This historical dependence means the strategic landscape is not static; it evolves with each round of play, creating a path-dependent dynamic where initial choices can significantly constrain or expand future options. Consequently, predicting outcomes requires modeling not just payoff structures, but also the cognitive biases that cause players to incorporate and react to the accumulated StateHistory, rendering standard Nash equilibrium calculations unreliable.

After 5,000 steps of training, the One Off Ochs’ game policies demonstrate varying performance across different matchups.

Beyond Prediction: Modeling Adaptive Behavior with Reinforcement Learning

MultiAgentReinforcementLearning offers a robust methodology for dissecting the dynamics of interactive systems, moving beyond the limitations of single-agent analyses. This framework allows researchers to model scenarios where numerous independent entities – whether robotic systems, economic actors, or even animal populations – learn to optimize their behavior through trial and error within a shared environment. Unlike traditional game theory, which often relies on assumptions of perfect rationality and complete information, this approach embraces the complexities of real-world interactions, permitting agents to adapt to changing circumstances and the actions of others. By defining environments as [latex]Markov\,Games[/latex], researchers can simulate intricate scenarios and observe emergent behaviors, providing valuable insights into cooperation, competition, and the evolution of strategies in complex systems. The power of this approach lies in its ability to capture the nuances of decentralized decision-making, offering a pathway to understanding and predicting collective outcomes.

The capacity for agents to learn effective strategies even when a stable, predictable outcome isn’t guaranteed represents a significant advancement in modeling complex systems. Traditional game theory often relies on the concept of a Nash Equilibrium – a point where no player benefits from unilaterally changing their strategy – but many real-world scenarios lack such a clear solution. Utilizing algorithms like Q-learning, agents can navigate these uncertain environments through trial and error, iteratively refining their actions based on received rewards. This approach doesn’t require a pre-defined optimal solution; instead, agents develop policies that maximize cumulative reward over time, effectively ‘learning’ to succeed even in the absence of a predictable equilibrium. The result is a more robust and realistic simulation of strategic interactions, capturing behaviors that deviate from strict rationality and mirroring the adaptive strategies observed in natural and social systems.
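The Q-learning mechanism the paragraph refers to reduces to a single temporal-difference update. The sketch below shows that update in tabular form; the hyperparameters and state/action names are illustrative assumptions, not the article's settings.

```python
# Sketch of the tabular Q-learning update, with assumed hyperparameters.

from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.95        # learning rate and discount (illustrative)
Q = defaultdict(float)          # Q[(state, action)] -> estimated return

def q_update(state, action, reward, next_state, actions):
    """Nudge Q toward reward + discounted value of the best next action."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

# One hypothetical step: the agent defected and received a reward of 1.0.
q_update("s0", "defect", reward=1.0, next_state="s1",
         actions=["cooperate", "defect"])
assert abs(Q[("s0", "defect")] - 0.1) < 1e-9   # 0.1 * (1.0 + 0.95*0 - 0)
```

Nothing in this update presumes an equilibrium exists: the agent simply chases cumulative reward, which is why the method remains well-defined in the pathological games where Nash Equilibria fail to exist.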

Simulations leveraging reinforcement learning reveal how Prospect Theory – the idea that individuals feel the pain of a loss more acutely than the pleasure of an equivalent gain – fundamentally alters strategic choices. Rather than converging on classically predicted Nash Equilibrium outcomes, agents consistently exhibit behavioral deviations, prioritizing risk aversion and loss avoidance. In repeated iterations of games like Chicken and Ochs’ Game, policy convergences were observed at approximately (0.8, 0.8) and (0.5, 0.2) respectively, signifying a consistent preference for strategies that minimize potential losses, even if it means foregoing potentially larger gains. These results demonstrate that incorporating psychologically-plausible biases into agent models offers a more realistic depiction of decision-making in strategic environments, moving beyond the assumptions of purely rational actors.

The foundation for simulating complex interactions between multiple agents rests upon the robust structure of the MarkovGame framework. This system moves beyond traditional Markov Decision Processes by explicitly incorporating [latex]StateHistory[/latex], allowing each agent’s decision-making process to be informed not just by the current environmental state, but by the entire sequence of past states and actions. This historical context is crucial for modeling realistic behavioral patterns, particularly in scenarios where strategic considerations and reputation building are paramount. By tracking this history, the framework accurately captures the evolving dynamics of multi-agent systems, enabling researchers to investigate how agents adapt their strategies over time and ultimately converge – or fail to converge – on stable outcomes, even in the absence of a predefined equilibrium.
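The history-conditioned step described above can be sketched as a tiny class. All names here (the class, the policy signature) are illustrative assumptions about how such a framework might be structured, not the article's implementation.

```python
# Minimal sketch of a Markov-game step in which each agent's policy
# conditions on the full StateHistory, not just the current state.
# All names and signatures are illustrative assumptions.

class HistoryMarkovGame:
    def __init__(self, transition, rewards):
        self.transition = transition   # (state, joint_action) -> next state
        self.rewards = rewards         # (state, joint_action) -> per-agent rewards
        self.history = []              # accumulated (state, joint_action) pairs

    def step(self, state, policies):
        # Each policy sees the entire history, enabling path-dependent play.
        joint = tuple(policy(state, self.history) for policy in policies)
        self.history.append((state, joint))
        return self.transition(state, joint), self.rewards(state, joint)

# Toy usage: a one-state game where both agents always cooperate.
game = HistoryMarkovGame(lambda s, a: s, lambda s, a: (1, 1))
always_c = lambda state, history: "C"
next_state, reward = game.step("s0", [always_c, always_c])
assert game.history == [("s0", ("C", "C"))]
```

Passing the history into the policy is the structural difference from a plain Markov Decision Process: it is what lets a learned policy implement reputation-sensitive or retaliatory behavior rather than a fixed state-to-action map.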

Anomalous learning agents (players 1 and 2) exhibit fluctuating Q-values and exponential moving average (EMA) of Q-value changes, indicating unstable learning behavior.

The study illuminates how agents, even artificial ones, are rarely purely rational actors, a departure from traditional game theory assumptions. This echoes John von Neumann’s observation: “The sciences do not try to explain why we exist, but how we exist.” The incorporation of Prospect Theory, modeling loss aversion and reference points, reveals that strategic interactions aren’t simply about maximizing gains, but also about minimizing perceived losses. Consequently, equilibrium outcomes shift, demonstrating that understanding the how of decision-making (the underlying mechanisms) is crucial. The research effectively exploits comprehension by revealing how deviations from perfect rationality fundamentally alter multi-agent dynamics, effectively reverse-engineering the complexities of strategic behavior.

What’s Next?

The assertion that agents, even artificial ones, operate from fixed rational foundations has begun to fracture. This work demonstrates that injecting behavioral biases (specifically, the asymmetries inherent in Prospect Theory) doesn’t simply add noise to multi-agent systems; it fundamentally alters the landscape of strategic interaction. The observed deviations from Nash Equilibrium aren’t glitches, but rather predictable consequences of modeling agents that, like their biological counterparts, are loss-averse and reference-dependent. A bug is the system confessing its design sins.

However, the current framework remains largely confined to relatively simple game scenarios. The true test lies in scaling these models to encompass the messiness of real-world interactions: environments characterized by incomplete information, asynchronous action, and a proliferation of agents, each with potentially unique behavioral profiles. Can a coherent equilibrium even be defined in such a system, or is the pursuit of optimality a fundamentally misguided endeavor?

Future research should also investigate the interplay between different behavioral biases, and explore whether agents can learn to exploit, or even manipulate, the biases of others. The goal isn’t to create perfectly rational agents, but to understand the dynamics of irrationality. For within those deviations lie the keys to predicting, and perhaps even controlling, complex adaptive systems.


Original article: https://arxiv.org/pdf/2603.16916.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/

2026-03-19 14:19