Can AI Truly Cooperate? A Resilience Test for Artificial Intelligence

Author: Denis Avetisyan


New research reveals a stark contrast in the ability of humans and current AI agents to maintain cooperation when faced with challenging, resource-scarce environments.

The study investigates cooperative resilience by comparing human and large language model (LLM) agents subjected to a sequential curriculum of scenarios ranging from $E_1$ to $E_9$, and assesses the impact of communication. LLM agents were further enhanced with a reflection module incorporating ten fixed and ten dynamically generated questions to facilitate persistent memory accumulation across episodes, while humans and LLMs exchanged information through ten-second voice messages and per-timestep text emissions, respectively, categorized within six defined classes.
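The reflection prompts themselves are not reproduced in the article, but the mechanism can be pictured as a simple post-episode loop. The sketch below assumes a generic `llm(prompt) -> str` callable; the question wording and data structures are illustrative, not taken from the study.

```python
# Illustrative post-episode reflection step. The fixed questions shown are
# placeholders, not the paper's actual wording.
FIXED_QUESTIONS = [
    "What happened to the shared resource this episode?",
    "Which of my actions helped or harmed the group?",
    # ...eight more fixed questions would complete the set of ten.
]

def reflect(llm, episode_log: str, memory: list[str]) -> list[str]:
    """Run one reflection pass and grow the agent's persistent memory."""
    # Ask the model to propose ten episode-specific questions of its own.
    dynamic = llm(
        "Given this episode log, write ten short questions whose answers would "
        f"help an agent cooperate better next time:\n{episode_log}"
    ).splitlines()[:10]

    # Answer every fixed and dynamic question; the answers carry across episodes.
    for question in FIXED_QUESTIONS + dynamic:
        answer = llm(f"Episode log:\n{episode_log}\n\nQuestion: {question}")
        memory.append(f"Q: {question} A: {answer}")
    return memory
```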

This review compares the performance of human and large language model-based agents in multiagent systems facing social dilemmas and disruptive conditions, in order to evaluate cooperative resilience.

While artificial intelligence increasingly demonstrates proficiency in complex tasks, sustaining cooperative behavior under pressure remains a significant challenge. This is explored in ‘Evaluating Cooperative Resilience in Multiagent Systems: A Comparison Between Humans and LLMs’, which comparatively analyzes the ability of human groups and Large Language Model (LLM)-based agents to collaboratively navigate disruptive scenarios modeled on the Tragedy of the Commons. Results reveal a substantial gap in cooperative resilience, with human communication consistently outperforming even communicative LLM agents in maintaining shared resources under adverse conditions. Can insights from human decision-making in these challenging contexts inform the design of more robust and prosocial artificial intelligence?


The Inherent Tension Between Individual Gain and Collective Welfare

Numerous systems, whether biological communities or human economies, frequently encounter a fundamental tension between individual gain and the overall well-being of the group. This arises because incentives often prioritize immediate, personal benefits, potentially at the expense of long-term collective welfare. Consider a shared grazing land – each herder benefits from adding more animals, but unchecked, this leads to overgrazing and ultimately harms everyone. Similarly, in economic systems, individual companies may prioritize profit maximization, contributing to environmental degradation or social inequality. This disconnect between individual rationality and collective outcomes creates inherent instability, demanding mechanisms to align incentives and foster sustainable practices. The prevalence of such challenges highlights a universal principle: maintaining a thriving collective requires actively addressing the potential for self-interest to undermine shared prosperity.

The predictable depletion of shared resources, exemplified by Garrett Hardin’s “Tragedy of the Commons,” reveals a fundamental challenge to systemic stability. This occurs when individually rational actors, each pursuing their own self-interest, collectively degrade or exhaust a resource available to all. Consider fisheries, forests, or even clean air – each user benefits directly from exploiting the resource, but the cumulative effect of many such actions can lead to ruin for everyone. This dynamic isn’t limited to environmental issues; it manifests in economic systems through overconsumption and debt, and even in social networks through the spread of misinformation. The core issue lies in the disconnect between individual gain and collective cost, creating a feedback loop that incentivizes unsustainable practices until the system reaches a breaking point and collapses under its own weight.

The capacity of a system to maintain functionality and recover from disruptions – termed Cooperative Resilience – is increasingly recognized as fundamental to long-term stability, extending far beyond simple survival. This resilience isn’t solely about a system’s inherent strength, but rather the mechanisms that foster continued cooperation even when individual actors face hardship or incentives to defect. Research indicates that robust cooperative systems exhibit key traits: diversity in response strategies, redundancy in critical functions, and, crucially, established pathways for learning and adaptation following disturbances. Without these elements, even seemingly robust systems are vulnerable to cascading failures, as localized shocks can quickly propagate throughout the network. Ultimately, enhancing Cooperative Resilience requires a shift in focus from maximizing immediate gains to cultivating the capacity for sustained, collective well-being, ensuring the continued provision of essential resources and services in the face of inevitable challenges.

Cooperative resilience, measured across scenarios varying in disruptive events and resource elimination probability, is enhanced by human communication, as indicated by lighter color intensities representing higher resilience values.

Modeling Multiagent Systems: A Framework for Emergent Behavior

Multiagent Systems (MAS) provide a computational framework for modeling the interactions of autonomous entities – termed ‘agents’ – within a shared environment. This approach moves beyond analyzing individual components in isolation, instead focusing on emergent behaviors arising from decentralized interactions. In MAS, agents operate with varying degrees of autonomy, perception, and action capabilities, and can be programmed with diverse goals and strategies. The collective behavior of these agents, and how it impacts overall system outcomes, is the primary focus of study. This allows for the investigation of complex phenomena like cooperation, competition, and collective decision-making, where system-level properties are not simply the sum of individual agent behaviors, but rather result from their dynamic interactions. Modeling with MAS is applicable across diverse domains, including economics, biology, robotics, and social sciences.
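The decentralized interaction that MAS research studies reduces to an observe-act-step loop. The sketch below is a generic skeleton under assumed interfaces (the `env` object and string-valued actions are illustrative, not any specific framework's API).

```python
from typing import Protocol

class Agent(Protocol):
    """Anything that maps a local observation to an action string."""
    def act(self, observation: dict) -> str: ...

def run_episode(env, agents: list[Agent], max_steps: int = 100):
    """Decentralized loop: each agent acts on its own partial view only."""
    observations = env.reset()                      # one observation per agent
    for _ in range(max_steps):
        actions = [a.act(obs) for a, obs in zip(agents, observations)]
        observations, rewards, done = env.step(actions)
        if done:                                    # e.g. the shared resource collapsed
            break
    return env.summary()                            # system-level (emergent) outcome
```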

The ‘Melting Pot 2.0’ simulation suite is a computational platform designed to model resource allocation and consumption among multiple agents. It allows for precise control over key parameters such as resource regeneration rates, agent population size, and individual agent consumption behaviors. This controlled environment enables researchers to systematically investigate the dynamics of the ‘Tragedy of the Commons’ – a scenario where rational self-interest leads to depletion of a shared resource – by manipulating these parameters and observing the resulting collective outcomes. The suite facilitates repeatable experiments and quantitative analysis of agent interactions and resource sustainability under varying conditions, providing a robust framework for studying complex systems.
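Melting Pot 2.0's own configuration API is not reproduced here; the toy model below merely illustrates the kind of commons dynamics described in the text, where regrowth depends on how much of the resource is left, so over-harvesting can permanently eliminate a patch. All parameters are illustrative.

```python
import random

def step_resources(apples_per_tree: list[int], harvested: list[int],
                   regrow_p: float = 0.15, capacity: int = 12) -> list[int]:
    """Advance each patch one step: remove what was harvested, then regrow
    with a probability proportional to the remaining stock."""
    updated = []
    for apples, taken in zip(apples_per_tree, harvested):
        remaining = max(apples - taken, 0)
        # A fully stripped patch (remaining == 0) never recovers, mimicking
        # the permanent elimination of a tree.
        regrown = sum(
            random.random() < regrow_p * (remaining / capacity)
            for _ in range(capacity - remaining)
        )
        updated.append(remaining + regrown)
    return updated
```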

The simulation investigates agent behavior under conditions of resource scarcity, comparing the strategies employed by Large Language Model-based Agents (LLM Agents) and human participants. LLM Agents operate according to programmed algorithms designed to maximize resource acquisition, while human agents make decisions based on individual preferences and perceived fairness. Data is collected on resource consumption rates, collaborative tendencies, and overall system sustainability to quantify differences in approach. This comparative analysis allows for the examination of how distinct decision-making processes impact the dynamics of shared resource utilization and potential for collective resource depletion.

Heatmaps reveal that both human and LLM agents demonstrate increased cooperative resilience with communication (indicated by lighter colors) across varying numbers of disruptive events and probabilities of resource elimination.

LLM-Based Agents: The Significance of Communicative Capacity

The LLM-based agents employed in this research leverage the capabilities of GPT-4 as their core language model. These agents operate within a Generative Agents Architecture, a framework designed to facilitate action selection based on textual inputs. This architecture allows agents to receive prompts, process information, and determine subsequent actions solely through the interpretation of text. Specifically, actions are not pre-programmed or hard-coded; instead, the GPT-4 model generates actions conditioned on the provided textual prompts, enabling a flexible and dynamic response to environmental stimuli and communicative exchanges. This text-conditioned approach distinguishes the agents from traditional AI systems reliant on explicit state machines or rule-based behaviors.
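In practice, a text-conditioned policy amounts to prompt construction plus parsing of the reply. The sketch below again assumes a generic `llm(prompt) -> str` callable and an illustrative action vocabulary; it is not the paper's prompt.

```python
LEGAL_ACTIONS = ["move_up", "move_down", "move_left", "move_right", "harvest", "wait"]

def choose_action(llm, scene_description: str, memory: list[str]) -> str:
    """Build a prompt from memory and the current scene, then parse the reply."""
    recent = "\n".join(memory[-5:])                 # most recent reflections only
    prompt = (
        "You are one of several agents sharing an apple orchard.\n"
        f"Recent reflections:\n{recent}\n"
        f"Current situation:\n{scene_description}\n"
        f"Reply with exactly one of: {', '.join(LEGAL_ACTIONS)}."
    )
    reply = llm(prompt).strip().lower()
    # Fall back to a harmless default if the model replies out of vocabulary.
    return reply if reply in LEGAL_ACTIONS else "wait"
```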

The Observation-to-Text Adapter is a crucial component enabling LLM-based agents to interact with spatial environments. This adapter processes raw spatial data, such as object positions and agent locations, and translates it into natural language descriptions. Specifically, it generates textual inputs detailing the agent’s surroundings, including the presence, location, and characteristics of nearby objects and other agents. This conversion is essential because Large Language Models (LLMs) primarily process textual data; the adapter bridges the gap between the spatial world and the LLM’s linguistic input requirements, allowing the agent to ‘perceive’ and reason about its environment.
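A minimal adapter of this kind might look like the following; the observation fields (`self_position`, `visible_apples`, `visible_agents`) are hypothetical and stand in for whatever the simulation actually exposes.

```python
def observation_to_text(obs: dict) -> str:
    """Render a structured, egocentric observation as plain English for the LLM."""
    x, y = obs["self_position"]
    apples = obs.get("visible_apples", [])          # list of (x, y) grid positions
    others = obs.get("visible_agents", [])          # list of (name, (x, y)) pairs

    parts = [f"You are standing at ({x}, {y})."]
    if apples:
        nearest = min(apples, key=lambda p: abs(p[0] - x) + abs(p[1] - y))
        parts.append(f"You can see {len(apples)} apple(s); the nearest is at {nearest}.")
    else:
        parts.append("No apples are visible from here.")
    for name, pos in others:
        parts.append(f"Agent {name} is at {pos}.")
    return " ".join(parts)
```

Calling the adapter on a small observation such as `{"self_position": (2, 3), "visible_apples": [(4, 3)], "visible_agents": [("B", (0, 0))]}` yields a short description that the language model can reason over directly.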

Effective coordination between LLM-based agents relies on robust communication achieved through the use of semantic embeddings. These embeddings transform agent observations and intended actions into vector representations, capturing the meaning of information in a numerical format. By comparing the similarity of these vectors, agents can assess the relevance of information and align their strategies. Analysis using t-distributed stochastic neighbor embedding (t-SNE) allows for the visualization of these high-dimensional embeddings, providing insight into the shared understanding and communication patterns that emerge within the multi-agent system, and identifying potential breakdowns in coordination.
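Given a matrix of precomputed message embeddings (one row per message, from any sentence-embedding model) and a boolean label for who sent each message, the projection itself takes only a few lines with scikit-learn's `TSNE`. This is a generic visualization sketch, not the paper's analysis pipeline.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_message_space(embeddings: np.ndarray, is_human: np.ndarray) -> None:
    """Project message embeddings to 2-D and color them by sender type."""
    coords = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(embeddings)
    plt.scatter(coords[is_human, 0], coords[is_human, 1], label="human", alpha=0.6)
    plt.scatter(coords[~is_human, 0], coords[~is_human, 1], label="LLM", alpha=0.6)
    plt.legend()
    plt.title("t-SNE of message embeddings")
    plt.show()
```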

To assess the stability and fault tolerance of the multi-agent system, a ‘Disruptive Bot’ was implemented. This agent operates with a deliberately unsustainable strategy – specifically, it prioritizes actions that deplete shared resources without considering long-term consequences or system-wide impact. The Disruptive Bot’s behavior serves as a stress test, allowing researchers to observe how the other agents respond to destabilizing influences and to evaluate the overall resilience of the system against agents pursuing self-destructive or malicious strategies. Data collected from interactions with the Disruptive Bot informs the development of mechanisms for detecting and mitigating potentially harmful agent behaviors.
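The bot's policy can be sketched as a purely greedy rule over the same illustrative observation fields used above: head for the nearest apple and harvest it, never waiting for regrowth.

```python
class DisruptiveBot:
    """Greedy, myopic policy: always move toward and harvest the nearest apple."""
    def act(self, observation: dict) -> str:
        apples = observation.get("visible_apples", [])
        if not apples:
            return "move_up"                        # wander until something depletable appears
        x, y = observation["self_position"]
        ax, ay = min(apples, key=lambda p: abs(p[0] - x) + abs(p[1] - y))
        if (ax, ay) == (x, y):
            return "harvest"                        # take the apple, ignore long-term impact
        if ax != x:
            return "move_right" if ax > x else "move_left"
        return "move_up" if ay > y else "move_down"
```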

Analysis of nine resilience scenarios reveals the distribution of message types sent by human and by LLM agents.

Quantifying Resilience: A Metric for Systemic Stability

A quantifiable metric, termed the ‘Cooperative Resilience Score’, was developed to rigorously assess a system’s capacity to endure and recover from external disruptions, specifically modeled as ‘Environmental Shocks’. This score moves beyond simple measures of survival, instead evaluating the sustained availability of critical resources and the preservation of foundational elements within the system over time. By assigning a numerical value to resilience, researchers can compare the performance of different agents – be they human groups or artificial intelligence – under identical stress conditions. The score considers not only immediate responses to shocks, but also the long-term capacity to maintain functionality and prevent catastrophic resource depletion, offering a robust and objective basis for evaluating and enhancing systemic stability.
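The article does not spell out the score's exact formula, so the snippet below is only one plausible operationalization, assuming the score rewards keeping normalized resource levels high over time; the 64-apple and 6-tree maxima are the normalization constants used in the figures.

```python
import numpy as np

def resilience_score(apples_over_time: list[int], trees_over_time: list[int],
                     max_apples: int = 64, max_trees: int = 6) -> float:
    """Average normalized resource level over an episode: 1 = fully preserved, 0 = collapsed."""
    apples = np.asarray(apples_over_time, dtype=float) / max_apples
    trees = np.asarray(trees_over_time, dtype=float) / max_trees
    return float(np.mean((apples + trees) / 2.0))
```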

Research consistently reveals a significant disparity in resilience between human groups and large language model (LLM) agents when facing disruptive conditions. The newly developed ‘Cooperative Resilience Score’ quantifies this difference, demonstrating that human-operated systems consistently outperform their AI counterparts in scenarios simulating resource scarcity and environmental shocks. While LLM agents, even with access to the same information and rules, frequently deplete available resources, human groups exhibit strategic adaptability, maintaining substantially higher levels of both resource availability and long-term preservation: approximately 25% of the initial apple supply and 60% of the initial trees. This suggests that uniquely human capabilities, such as nuanced communication, collaborative problem-solving, and an intuitive understanding of long-term consequences, are critical factors in fostering resilience that current AI systems have yet to replicate.

Comparative analysis revealed a significant disparity in resource management between human groups and large language model (LLM) agents when subjected to simulated environmental pressures. While LLM agents consistently depleted available resources – demonstrating zero retention of the initial apple supply and failing to preserve any trees – human participants successfully maintained approximately 25% of the initial apple availability and preserved 60% of the original tree population. This outcome highlights a crucial difference in adaptive strategies; humans, even under disruptive conditions, exhibited behaviors consistent with long-term sustainability, suggesting an inherent capacity for nuanced decision-making and resource conservation that currently eludes even advanced artificial intelligence.

The capacity of a system to endure and recover from disturbances is profoundly linked to the harvesting strategies employed, and a focus on sustainability emerges as crucial for long-term resilience. Research indicates that practices prioritizing resource renewal, such as selective harvesting that allows for regrowth and diversification, significantly outperform exploitative approaches where resources are rapidly depleted. This principle extends beyond ecological systems; analogous strategies in socio-economic contexts, like diversified investment portfolios or rotating leadership, can buffer against unforeseen challenges. By understanding how sustainable harvesting maximizes resource availability over time, insights gained can directly inform real-world resource management, guiding policies that prioritize long-term stability and preventing the catastrophic collapse of vital systems – from fisheries and forests to energy supplies and agricultural lands.
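Reusing the toy `step_resources` model from the earlier sketch, the difference between restrained and exploitative harvesting can be made concrete; the harvest fractions and parameters below are arbitrary, chosen only to show the qualitative contrast.

```python
import random

def simulate(harvest_fraction: float, steps: int = 50, seed: int = 0) -> list[int]:
    """Track total stock over time under a fixed harvest fraction per patch."""
    random.seed(seed)
    stock = [12] * 6                              # six trees, fully stocked patches
    totals = []
    for _ in range(steps):
        taken = [round(s * harvest_fraction) for s in stock]
        stock = step_resources(stock, taken)      # toy dynamics defined earlier
        totals.append(sum(stock))
    return totals

# Restrained harvesting settles at a positive stock; greedy harvesting collapses to zero.
print(simulate(0.2)[-1], simulate(0.9)[-1])
```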

Across nine scenarios, both human conditions, with and without communication, demonstrated similar mean environmental and social metric performance, normalized to maximum possible apple (64) and tree (6) yields.

Toward More Robust and Adaptive Systems: Future Directions

Current large language model (LLM)-based agents often struggle with the nuances of human interaction, frequently exhibiting logical but socially inept behavior. Researchers are addressing this limitation by incorporating ‘Social Heuristics’ – simplified rules or mental shortcuts derived from observing human social dynamics – directly into the agents’ decision-making processes. This integration allows the agents to move beyond purely rational calculations and consider factors like politeness, reciprocity, and reputation, ultimately improving their ability to navigate complex social situations. By mimicking these intuitive social cues, the agents can foster more effective communication, build trust, and achieve more successful outcomes in collaborative environments, moving closer to truly seamless human-agent interaction.

A deeper understanding of incremental co-construction, the process by which shared meaning emerges step by step through communication, is paramount for building truly intelligent systems. This process isn’t simply about transmitting information; it involves active negotiation, clarification, and mutual adaptation between communicating agents. Research suggests that successful co-construction relies on subtle cues, such as acknowledgements, repairs, and elaborations, that signal understanding or request further detail. By modeling these dynamics, engineers can create agents capable not only of interpreting language, but also of actively participating in a collaborative meaning-making process, leading to more effective teamwork and problem-solving in complex, real-world scenarios. The ability to build shared understanding incrementally will be essential for agents operating in dynamic environments where complete information is rarely available upfront.

The convergence of advanced artificial intelligence with complex societal challenges necessitates the development of systems that are not merely intelligent, but also resilient and adaptable. This research establishes a foundational framework for creating such systems, envisioning applications that extend beyond conventional problem-solving to encompass the nuances of resource allocation, environmental protection, and the promotion of collective well-being. By focusing on the underlying principles of robust decision-making and collaborative understanding, this work aims to facilitate the creation of tools capable of navigating unpredictable circumstances and fostering sustainable solutions for a rapidly changing world. The long-term impact anticipates a shift towards proactive systems that anticipate challenges, optimize resource utilization, and contribute to a more equitable and sustainable future for all.

A t-SNE visualization of semantic embeddings reveals that human and large language model messages occupy distinct regions of the embedding space, as indicated by their respective kernel-density contours.

The study’s findings regarding the disparity between human and LLM-based agent performance in maintaining cooperation under disruptive conditions echo a fundamental tenet of robust system design. As John von Neumann observed, “The sciences do not try to explain why we exist, but how we exist.” This research doesn’t merely document a performance gap; it illuminates how current LLM agents fail to exhibit the nuanced, anticipatory behavior necessary for cooperative resilience, particularly when facing the pressures of a tragedy of the commons scenario. The inability of these agents to consistently prioritize collective welfare suggests a deficiency in their underlying algorithmic structures, demanding a shift towards provably robust strategies rather than empirically derived approximations.

The Road Ahead

The observed disparity in cooperative resilience between human subjects and current large language model-based agents is not merely a quantitative difference, but a qualitative one. The agents, demonstrably, fail to internalize the principles of sustained collective welfare in disruptive environments, relying instead on strategies that, while perhaps locally optimal, are globally self-defeating. This is not a failure of processing power, but of axiomatic foundation. A system can compute the optimal defection strategy with perfect accuracy and still fail to understand the tragedy of the commons. Simplicity, it must be reiterated, does not equate to brevity; it demands non-contradiction and logical completeness.

Future work should therefore not focus on scaling model parameters, but on rigorously defining the underlying principles of cooperative behavior. A provably correct algorithm for maintaining collective resilience, grounded in game theory and social choice, remains elusive. Current approaches, predicated on reinforcement learning within simulated social dilemmas, offer only empirical approximations. The challenge lies in translating intuitive human understanding, a tacit awareness of reciprocal altruism and the long-term consequences of defection, into a formal, verifiable system.

The pursuit of truly cooperative AI necessitates a shift in perspective. It is not enough to build agents that appear to cooperate; they must be compelled, by their internal logic, to prioritize collective welfare, even at a cost to individual reward. Only then can one speak of genuine resilience in the face of inevitable disruption.


Original article: https://arxiv.org/pdf/2512.11689.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
