The Limits of Collaboration: When Adding Agents Doesn’t Help

Author: Denis Avetisyan

New research reveals that simply increasing the number of AI agents doesn’t guarantee improved performance, and highlights the crucial role of task design and coordination.

Agentic systems demonstrate that collaborative architectures, particularly centralized and hybrid coordination, amplify the benefits of increasing model intelligence and consistently outperform single-agent baselines. Gains in capability are therefore not solely attributable to individual model scaling, as evidenced by performance deltas reaching +8.7% across benchmarks.

Effective multi-agent systems are constrained by task decomposability, coordination complexity, and the propagation of errors, limiting the benefits of scaling beyond moderate team sizes.

Despite the increasing prevalence of multi-agent systems powered by large language models, a principled understanding of their scaling behavior remains elusive. This research, ‘Towards a Science of Scaling Agent Systems’, systematically investigates the interplay between task structure, coordination complexity, and performance in these agentic systems. Our findings reveal that gains from multi-agent coordination diminish beyond moderate team sizes and are critically dependent on matching coordination topology to inherent task properties, with centralized approaches excelling at parallelizable reasoning and decentralized strategies better suited for dynamic navigation. Can these insights pave the way for a predictive framework guiding the design and deployment of scalable, high-performing agent systems across diverse applications?

Deconstructing Intelligence: The Limits of Singular Systems

Conventional language models, while proficient at tasks like text completion and translation, often falter when confronted with problems requiring prolonged, step-by-step reasoning or interaction with dynamic environments. These models typically process information in a static fashion, lacking the ability to maintain internal states or adapt to changing circumstances – crucial elements for tackling complex challenges. For instance, a task demanding strategic planning, such as navigating a virtual maze or managing resources in a simulated economy, exposes the limitations of these systems. The absence of a persistent memory and the inability to learn from iterative feedback hinder their performance, leading to errors and inefficiencies. Consequently, there is a growing need for alternative approaches that can overcome these constraints and enable more robust and adaptable artificial intelligence.

The limitations of traditional language models when faced with intricate, real-world problems are being addressed by a shift towards Multi-Agent Systems (MAS). Rather than relying on a single, monolithic model, MAS harness the power of collaboration, distributing tasks among multiple agents capable of independent reasoning and action. This approach mimics complex natural systems and demonstrates significant potential for solving problems requiring sustained interaction and adaptation. Recent studies report that the performance of these systems scales predictably, with an $R^2$ value of 0.89 achieved on previously unseen data. Predictable does not mean monotonic, however: whether a larger team helps or hurts depends on the task, so the value of such a fit lies in anticipating when scaling MAS will pay off rather than in guaranteeing that it will.

The practical implementation of multi-agent systems relies heavily on how effectively individual agents coordinate their actions and how comprehensively a complex task is broken down into manageable sub-components. Successful task decomposition isn’t merely about division of labor; it requires a nuanced understanding of each agent’s capabilities and limitations, ensuring optimal allocation of resources and minimizing redundant effort. Coordination strategies, ranging from centralized planning to decentralized negotiation, dictate how agents share information, resolve conflicts, and synchronize their activities. The choice of strategy profoundly impacts the system’s robustness, scalability, and ability to adapt to dynamic environments; poorly coordinated agents can quickly devolve into chaos, while overly rigid structures may stifle innovation and limit overall performance. Consequently, research focuses on developing adaptive coordination mechanisms and intelligent task allocation algorithms that enable multi-agent systems to tackle increasingly complex challenges with greater efficiency and resilience.

Comparative analysis across four benchmarks demonstrates that multi-agent system performance is highly task-dependent, ranging from significant improvements in parallelizable tasks like Finance Agent (+80.9%) to substantial degradation in sequential reasoning tasks like PlanCraft (-70%), highlighting the importance of task structure and orchestration costs.

Architecting Collaboration: Topologies of Control

The coordination topology of a multi-agent system (MAS) defines the communication pathways between agents and fundamentally impacts system performance. A fully centralized topology features a single agent responsible for all decision-making and communication, offering strong control but introducing a single point of failure and potential bottlenecks. In contrast, a fully decentralized topology allows each agent to interact directly with others without a central authority, increasing robustness and scalability but potentially leading to communication overhead and conflicting decisions. Intermediate topologies, such as hierarchical or clustered arrangements, attempt to balance these trade-offs. The optimal topology is dependent on the specific application, considering factors such as agent density, communication bandwidth, and the required level of coordination. Performance metrics directly affected by the coordination topology include convergence time, robustness to agent failure, and overall system efficiency.
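To make the bottleneck-versus-overhead trade-off concrete, the sketch below counts communication links in the two extreme topologies. It assumes a centralized system is a simple star (every worker talks only to one hub) and a decentralized system is a full mesh (every agent can talk to every other); the function names are illustrative and do not come from the study.

```python
def star_links(n_agents: int) -> int:
    """Links in a star topology: each non-hub agent has one link to the hub."""
    if n_agents < 1:
        raise ValueError("need at least one agent")
    return n_agents - 1


def mesh_links(n_agents: int) -> int:
    """Links in a full mesh: one link per unordered pair of agents."""
    if n_agents < 1:
        raise ValueError("need at least one agent")
    return n_agents * (n_agents - 1) // 2


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(f"{n:>2} agents: star={star_links(n):>3}  mesh={mesh_links(n):>3}")
```

The star grows linearly but funnels every message through one agent, while the mesh removes the hub at the price of quadratic growth in possible pairwise exchanges.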

Independent coordination in multi-agent systems (MAS) allows each agent to make decisions and act without direct reliance on, or communication with, other agents. While this approach fosters robustness and scalability by eliminating single points of failure and communication bottlenecks, it frequently results in inefficiencies when addressing complex scenarios. These inefficiencies stem from duplicated effort, potential conflicts arising from uncoordinated actions, and the inability to leverage collective knowledge or resources. Specifically, agents may independently solve identical subproblems or pursue conflicting goals, leading to suboptimal overall system performance. The computational cost of redundant actions and the potential for error propagation can significantly increase with the complexity of the environment and the number of agents involved.
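A back-of-the-envelope calculation shows why unchecked errors become more likely as teams grow. The sketch assumes each agent makes a mistake independently with the same probability and that nothing catches those mistakes; both are simplifying assumptions made here for illustration, not measurements from the research.

```python
def p_any_error(per_agent_error: float, n_agents: int) -> float:
    """Probability that at least one of n independent agents makes an error."""
    if not 0.0 <= per_agent_error <= 1.0:
        raise ValueError("per_agent_error must be in [0, 1]")
    if n_agents < 0:
        raise ValueError("n_agents must be non-negative")
    # Complement of "every agent is correct".
    return 1.0 - (1.0 - per_agent_error) ** n_agents


if __name__ == "__main__":
    for n in (1, 3, 5, 10):
        print(f"{n:>2} agents, 5% error each -> P(any error) = {p_any_error(0.05, n):.3f}")
```

Under these assumptions a 5% per-agent error rate turns into roughly a 40% chance that a ten-agent team contains at least one mistake, which is why some form of verification matters once agents stop checking each other.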

Coordination topologies in multi-agent systems (MAS) present a fundamental trade-off between control and flexibility; centralized coordination, typically implemented via a single coordinating agent or a master-slave architecture, provides high control and predictability but introduces a single point of failure and potential bottlenecks. Conversely, decentralized coordination allows agents to operate independently, increasing robustness and scalability, but can lead to inefficiencies due to a lack of global optimization. Hybrid coordination topologies seek to mitigate these limitations by combining elements of both approaches, often employing a tiered or clustered architecture where local groups of agents coordinate amongst themselves while occasionally synchronizing with a higher-level coordinating entity; this allows for both localized responsiveness and a degree of global awareness and control.
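One way to picture the hybrid arrangement is to count links in a clustered layout. The sketch assumes equal-sized clusters that are fully connected internally, with one leader per cluster linked to a separate top-level coordinator; this particular layout and the name `hybrid_links` are hypothetical choices for illustration.

```python
def hybrid_links(n_agents: int, cluster_size: int) -> int:
    """Links in a clustered layout: a full mesh inside each cluster,
    plus one link from each cluster leader to a top-level coordinator."""
    if cluster_size < 1 or n_agents < cluster_size:
        raise ValueError("cluster_size must be between 1 and n_agents")
    if n_agents % cluster_size != 0:
        raise ValueError("this sketch assumes equal-sized clusters")
    n_clusters = n_agents // cluster_size
    links_inside_one_cluster = cluster_size * (cluster_size - 1) // 2
    return n_clusters * links_inside_one_cluster + n_clusters


if __name__ == "__main__":
    # 16 agents in clusters of 4: 4 * 6 local links + 4 leader links.
    print(hybrid_links(16, 4))
```

For 16 agents in clusters of four this gives 28 links, between the 15 of a pure star and the 120 of a full mesh, which is the middle ground the hybrid design is aiming for.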

Analysis of cost and performance across different large language models and multi-agent architectures reveals a family-dependent scaling law where OpenAI models benefit from centralized coordination, Google models exhibit diminishing returns from lightweight coordination, and Anthropic models demonstrate sensitivity to coordination overhead, collectively highlighting the importance of aligning coordination structure with model family for optimal economic efficiency and emergent performance.

The Scaling Principle: Beyond Linear Expectations

The performance of Multi-Agent Systems (MAS) does not scale linearly with increases in system characteristics; instead, a ‘Scaling Principle’ dictates that interactions between these characteristics determine overall outcomes. This principle implies that simply adding more agents or increasing individual agent capabilities does not guarantee proportional improvements in system performance. The observed relationship is more complex, potentially exhibiting accelerating or decelerating returns depending on the interplay of factors like domain complexity, the number of tools available, and the agents’ modeling capabilities. Therefore, evaluating MAS performance requires considering these interactions rather than treating performance as a simple summation of individual component contributions.

The effectiveness of various coordination topologies within multi-agent systems (MAS) is demonstrably affected by three key system characteristics: Domain Complexity, ToolCount, and ModelCapability. Higher Domain Complexity, representing the intricacy of the problem space, necessitates more robust coordination strategies to manage increased interdependence between agents. Similarly, a larger ToolCount – the number of distinct tools available to agents – introduces combinatorial challenges in task allocation and requires topologies capable of efficiently distributing tool usage. Finally, an agent’s ModelCapability, reflecting its ability to accurately model the environment and other agents, directly influences the precision with which coordination plans can be executed; lower ModelCapability demands more resilient and adaptable topologies to mitigate errors arising from imperfect information.

Analysis of multi-agent system (MAS) performance indicates that increasing the number of reasoning turns ($TurnCount$) does not yield linear improvements in coordination efficiency. Observed data reveals a scaling exponent of 1.724 for reasoning turn count; because this exponent is greater than 1, the relationship is super-linear, suggesting that the associated performance degradation accelerates rather than accumulating at a constant rate. Furthermore, domain complexity exhibits a statistically significant negative relationship with performance, with a coefficient of -0.114 (p < 0.002), indicating that inefficiencies in coordination are amplified in more complex environments.
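The practical meaning of an exponent above 1 is easiest to see numerically. The sketch below treats the relationship as a pure power law, $y \propto x^{1.724}$; the text reports only the exponent, so the choice of what sits on each axis, the absence of a prefactor, and the function name are all assumptions made for illustration.

```python
TURN_EXPONENT = 1.724  # exponent quoted in the text


def growth_factor(scale: float, exponent: float = TURN_EXPONENT) -> float:
    """Factor by which y grows when x is multiplied by `scale`, for y ~ x**exponent."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return scale ** exponent


if __name__ == "__main__":
    for k in (2, 3, 4):
        print(f"x multiplied by {k} -> y multiplied by about {growth_factor(k):.2f}")
```

Under this reading, doubling the driving quantity multiplies the dependent one by about 3.3 rather than 2, which is the sense in which costs outrun linear expectations.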

Performance comparisons between Gemini-2.0 Flash and Gemini-2.5 Pro reveal that while both models initially benefit from multi-agent coordination, the optimal number of agents for peak performance diverges due to differences in architecture and coordination strategy, ultimately demonstrating that scaling benefits are limited by coordination overhead.

Unlocking Efficiency: The Value of Shared Understanding

CoordinationEfficiency serves as a pivotal benchmark in multi-agent systems, quantifying how effectively agents synchronize actions to achieve collective goals. This metric isn’t merely a measure of operational smoothness; it directly dictates EconomicEfficiency, representing the system’s ability to maximize resource utilization and minimize wasted effort. A system exhibiting high CoordinationEfficiency reduces redundant actions, communication overhead, and potential conflicts, translating into substantial gains in overall productivity and cost-effectiveness. Consequently, improvements in coordination protocols – such as optimized signaling or decentralized decision-making – yield quantifiable economic benefits, making CoordinationEfficiency a central focus for researchers striving to build practical and scalable agent-based solutions. The relationship is often modeled as a function where $EconomicEfficiency = f(CoordinationEfficiency, ResourceCost)$, emphasizing that even with abundant resources, poor coordination can negate potential gains.
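The functional form of $f$ is not given above, so the following is only a toy instance: it assumes economic efficiency is the coordinated fraction of effort divided by resource cost. The formula and the name `economic_efficiency` are hypothetical, chosen to show how poor coordination can cancel out a larger budget.

```python
def economic_efficiency(coordination_efficiency: float, resource_cost: float) -> float:
    """Toy f(CoordinationEfficiency, ResourceCost): useful fraction of effort per unit cost.

    This ratio is an illustrative assumption, not the relationship used in the study.
    """
    if not 0.0 <= coordination_efficiency <= 1.0:
        raise ValueError("coordination_efficiency must be in [0, 1]")
    if resource_cost <= 0:
        raise ValueError("resource_cost must be positive")
    return coordination_efficiency / resource_cost


if __name__ == "__main__":
    lean = economic_efficiency(0.9, 10.0)    # well coordinated, modest budget
    bloated = economic_efficiency(0.3, 40.0)  # poorly coordinated, large budget
    print(f"lean={lean:.4f}  bloated={bloated:.4f}")
```

In this toy model the smaller, well-coordinated system delivers more per unit of cost than the larger one, matching the qualitative point that resources alone do not compensate for weak coordination.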

The success of multi-agent systems hinges not simply on agents acting, but on how they share and utilize information during interaction. Maximizing InformationGain – the reduction of uncertainty through communication – is therefore central to efficient coordination. Each interaction presents an opportunity to refine an agent’s understanding of the environment and the intentions of others, thereby decreasing redundant actions and increasing the likelihood of achieving collective goals. This principle suggests that effective communication protocols aren’t merely about transmitting data, but about strategically conveying information that yields the greatest reduction in overall system entropy. Consequently, research focuses on designing agents capable of discerning the value of information before sharing it, prioritizing exchanges that offer the most substantial benefit to the collective, and ultimately, amplifying the utility derived from every interaction, leading to substantial gains in overall system performance, measured by metrics like economic efficiency and task completion rates.
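Reduction of uncertainty can be made measurable with Shannon entropy. The sketch below scores a single message by the drop in entropy of an agent's belief over a small set of hypotheses; representing beliefs as discrete probability lists and measuring one message at a time are simplifications assumed here, and the function names are illustrative.

```python
import math
from typing import Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probs if p > 0)


def information_gain(prior: Sequence[float], posterior: Sequence[float]) -> float:
    """Entropy removed by moving from the prior belief to the posterior belief."""
    return entropy(prior) - entropy(posterior)


if __name__ == "__main__":
    prior = [0.25, 0.25, 0.25, 0.25]   # four equally likely hypotheses
    posterior = [0.5, 0.5, 0.0, 0.0]   # a message rules out two of them
    print(information_gain(prior, posterior))  # 1.0 bit
```

A message that leaves the belief unchanged scores zero, which gives an agent a simple criterion for skipping exchanges that cost tokens without reducing uncertainty.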

The successful integration of multi-agent systems (MAS) into real-world applications hinges on achieving practical scalability and efficiency. While theoretical models demonstrate potential, translating these into functioning systems requires careful optimization of coordination and information gain. Across diverse fields – from robotic swarms and smart grids to distributed sensor networks and logistical optimization – the ability of agents to collaborate effectively while minimizing redundant information exchange is paramount. Without this optimization, computational costs can quickly escalate, hindering performance and limiting the size and complexity of deployable systems. Consequently, research focused on streamlining agent interactions and maximizing the utility of shared data is not merely an academic pursuit, but a fundamental requirement for unlocking the transformative potential of MAS in addressing complex, real-world challenges.

This table compares the architectural complexities of different agent methods, evaluating them based on LLM calls, coordination overhead, and potential for parallelization.

The pursuit of scaling agent systems, as detailed in the research, mirrors a fundamental principle of reverse engineering. The study highlights how diminishing returns emerge beyond moderate team sizes, demonstrating that simply adding more agents doesn’t guarantee proportional improvement: a system’s architecture and the task’s inherent decomposability become critical constraints. This echoes an observation attributed to Blaise Pascal: “The eloquence of angels is no more than the murmur of an instrument.” Just as angelic eloquence isn’t inherently valuable without a structured expression, increasing the number of agents, the ‘instruments’, doesn’t automatically yield a more effective system. The research underscores that true advancement requires understanding the limitations of the underlying ‘instruments’ and coordinating them in a meaningful topology to address error propagation and coordination complexity.

What’s Next?

The observed limits to scaling in multi-agent systems aren’t simply engineering hurdles; they represent fundamental constraints on distributed cognition. The research suggests that beyond a certain point, increasing agents doesn’t yield proportional gains, and that error propagation becomes the dominant factor. It is tempting to view this as a call for ‘better’ agents, but that misses the point. The system isn’t failing because the individual components are weak; it’s failing because the structure of interaction doesn’t support increased complexity.

Future work must move beyond benchmarking raw performance and focus on characterizing the relationship between task decomposability and optimal coordination topologies. It will be crucial to develop analytical tools that predict scaling behavior a priori, allowing for the design of agent systems tailored to specific problem structures. The best hack is understanding why it worked, and every patch is a philosophical confession of imperfection.

Ultimately, the field needs to embrace a more critical perspective. The pursuit of ever-larger agent swarms feels increasingly like a technological echo of old managerial philosophies – throwing more bodies at a problem rarely solves it. Instead, the real challenge lies in understanding how to create genuinely intelligent distributed systems, not just bigger ones.

Original article: https://arxiv.org/pdf/2512.08296.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
