Robots That Remember: Scaling Collaboration with Shared Experience

Author: Denis Avetisyan

A new framework enables teams of robots to learn from past tasks, dramatically improving efficiency and reducing the need for constant re-planning.

The system operates through a layered planning process: a large language model provides the initial guidance (→), and specialized modules ([latex]\Rightarrow[/latex]) then refine and extend that initial plan.

MeCo leverages task similarity and large language models to cache and reuse successful plans in multi-robot systems.

While multi-robot systems offer gains in efficiency, most collaboration methods require extensive task-specific training, which hinders adaptability. Recent work leveraging large language models (LLMs) aims to overcome this limitation. This paper introduces ‘MeCo: Enhancing LLM-Empowered Multi-Robot Collaboration via Similar Task Memoization’, a framework that reduces computational cost by reusing solutions to similar tasks instead of replanning from scratch. MeCo achieves this through a new similarity testing method and a task memoization approach, minimizing redundant LLM invocations and improving success rates. Could this cache-and-reuse strategy unlock truly scalable and flexible multi-robot collaboration in dynamic, real-world environments?

The Inevitable Bottleneck of Naive Coordination

Current large language model (LLM)-based approaches to multi-robot coordination, such as ReAct and HMAS-2, face significant hurdles when applied to genuinely complex tasks. While promising in principle, these methods are computationally expensive, requiring substantial processing power to reason through even moderately intricate scenarios. This limitation stems from the depth of reasoning required to effectively plan for multiple robots operating simultaneously, where the state space grows exponentially with each additional agent. The inherent sequential nature of LLM processing further exacerbates these issues, creating bottlenecks that hinder real-time responsiveness and scalability. Consequently, applying these techniques to dynamic, real-world environments with numerous robots proves challenging, often resulting in slow performance and an inability to adapt to unforeseen circumstances.

Centralized planning, while conceptually straightforward for coordinating multiple robots, quickly encounters limitations as the number of robots or the complexity of the environment increases. These systems rely on a single, central controller to compute optimal plans for all robots, demanding exponentially more computational resources with each added unit or variable. This creates a significant bottleneck, hindering real-time responsiveness and preventing adaptation to dynamic changes – a crucial requirement for tasks like search and rescue or disaster response. The single point of failure inherent in centralized architectures also reduces the robustness of the system; if the central controller fails, the entire multi-robot team is effectively immobilized. Consequently, while offering a clear organizational structure, centralized planning methods prove impractical for scaling to the larger, more unpredictable environments where multi-robot collaboration truly offers a substantial advantage.

Effective multi-robot coordination hinges on streamlined planning processes that reduce dependence on the substantial computational demands of large language model (LLM) reasoning. While LLMs offer promising avenues for task allocation and problem-solving, their inference cost quickly becomes a limitation as the number of robots increases and environmental dynamics shift. Researchers are therefore exploring alternative approaches, such as decentralized planning and task decomposition, that prioritize efficient algorithms and minimize the need for extensive, real-time LLM inference. The aim is scalable systems in which robots can adapt to unforeseen circumstances and collaborate effectively without being constrained by computational bottlenecks.

S-Planner coordinates tasks between robots ‘Alice’ and ‘Bob’ by decomposing high-level goals and planning low-level motions towards shared target positions [latex]\mathbf{x}[/latex] indicated by red points, following a reference trajectory shown in blue.

The Echo of Past Success: MeCo’s Core Principle

MeCo functions as a planning framework designed to improve efficiency by capitalizing on the commonalities between different tasks. Instead of generating plans from scratch for each new scenario, MeCo maintains a repository of previously successful plans and assesses the similarity of the current task to those stored solutions. This assessment utilizes defined metrics to identify tasks with shared characteristics. When a sufficiently similar task is identified, MeCo reuses the associated plan, significantly reducing the computational burden and time required for planning. This approach minimizes the need for repeated complex calculations and leverages existing knowledge to expedite the planning process, particularly in scenarios involving a high degree of task repetition or predictable variations.

Task Similarity Testing within the MeCo framework categorizes new tasks by quantifying their resemblance to previously completed tasks, primarily through the assessment of Workspace Overlap. This overlap is determined by calculating the intersection of the task’s required workspace with the workspaces of existing tasks in the system’s memory. A higher degree of workspace intersection indicates a stronger similarity, suggesting that plans generated for the original tasks may be adaptable for the new task. The system employs a similarity threshold; tasks exceeding this threshold are considered sufficiently similar to leverage cached plans, while those falling below initiate new planning sequences. This quantitative approach enables efficient reuse of prior knowledge without requiring explicit task labeling or semantic understanding.
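The thresholded overlap test described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes workspaces are axis-aligned 2D rectangles and measures overlap as intersection-over-union, and the names `Box`, `workspace_overlap`, `is_similar`, and the default threshold of 0.5 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle standing in for a task's workspace (assumption)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        # A degenerate or inverted box has zero area.
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


def workspace_overlap(a: Box, b: Box) -> float:
    """Intersection-over-union of two workspaces, in [0, 1]."""
    inter = Box(
        max(a.x_min, b.x_min),
        max(a.y_min, b.y_min),
        min(a.x_max, b.x_max),
        min(a.y_max, b.y_max),
    ).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def is_similar(new: Box, cached: Box, threshold: float = 0.5) -> bool:
    """A cached plan is a reuse candidate only if overlap exceeds the threshold."""
    return workspace_overlap(new, cached) > threshold
```

With this measure, two identical workspaces score 1.0 and disjoint ones score 0.0, so the threshold directly controls how aggressively cached plans are reused.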

The Similar Motion Planner (S-Planner) within MeCo functions by retrieving and adapting previously computed plans from a solution cache. Prior to invoking the Language Model (LLM) for plan generation, the S-Planner assesses the current task’s similarity to those in the cache. If a sufficiently similar task is identified, the corresponding plan is reused as a starting point, significantly reducing the computational demands on the LLM. This cached solution is then adapted to the specifics of the current task through techniques such as trajectory modification or goal adjustment. The efficiency gained by referencing cached solutions directly correlates to a reduction in LLM call frequency, thereby minimizing planning time and resource utilization.
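As a rough sketch of the retrieve-then-adapt step, the snippet below picks the most similar cached entry and applies a simple goal adjustment. Everything here is an illustrative assumption: the cache layout (a list of goal/path pairs), the `inverse_distance` similarity, the linear blending used in `adapt_plan`, and the function names are not taken from the paper.

```python
import math
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]
CacheEntry = Tuple[Point, List[Point]]  # (goal of the cached task, cached path)


def inverse_distance(a: Point, b: Point) -> float:
    """Toy similarity in (0, 1]: 1.0 for identical goals, smaller when far apart."""
    return 1.0 / (1.0 + math.dist(a, b))


def adapt_plan(cached_path: List[Point], new_goal: Point) -> List[Point]:
    """Shift a cached path so it ends at new_goal while keeping its start fixed.

    The goal offset is blended in linearly along the path (illustrative choice).
    """
    if len(cached_path) < 2:
        return [new_goal]
    dx = new_goal[0] - cached_path[-1][0]
    dy = new_goal[1] - cached_path[-1][1]
    last = len(cached_path) - 1
    return [(x + dx * i / last, y + dy * i / last)
            for i, (x, y) in enumerate(cached_path)]


def plan_with_cache(
    new_goal: Point,
    cache: List[CacheEntry],
    similarity: Callable[[Point, Point], float] = inverse_distance,
    threshold: float = 0.5,
) -> Optional[List[Point]]:
    """Return an adapted cached path, or None to signal that the LLM is needed."""
    best = max(cache, key=lambda entry: similarity(new_goal, entry[0]), default=None)
    if best is None or similarity(new_goal, best[0]) <= threshold:
        return None
    return adapt_plan(best[1], new_goal)
```

Returning `None` on a cache miss keeps the decision to call the LLM with the caller, which is where the savings in LLM invocations would be counted.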

The Continuous Planning Module within MeCo provides a fallback mechanism to ensure robust plan generation when the Similar Motion Planner (S-Planner) cannot produce a valid solution. This module seamlessly integrates with the S-Planner; upon failure of the S-Planner to find a suitable plan – typically due to significant deviations from cached solutions – control is automatically transferred to the Continuous Planning Module. This module then leverages Large Language Models (LLMs) to resume the planning process from the current state, effectively continuing the task without requiring a complete restart. This handoff is designed to be transparent, maintaining planning continuity and preventing interruptions, and guarantees a solution is pursued even in scenarios where similarity-based planning is insufficient.
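The handoff can be pictured with the sketch below, which replays a cached plan step by step and passes control to a fallback planner at the first step that fails validation. The step representation (plain strings), the `is_valid` check, and the `llm_continue` callback are hypothetical stand-ins; a real system would call an LLM and a motion validator here.

```python
from typing import Callable, List, Tuple


def run_with_fallback(
    cached_steps: List[str],
    is_valid: Callable[[str], bool],
    llm_continue: Callable[[List[str]], List[str]],
) -> Tuple[List[str], bool]:
    """Execute cached steps; on the first invalid step, let the fallback planner
    continue from the steps already executed instead of restarting.

    Returns the full list of steps and whether the fallback was used.
    """
    executed: List[str] = []
    for step in cached_steps:
        if not is_valid(step):
            # Hand off: the fallback sees the progress so far (the "current state").
            return executed + llm_continue(executed), True
        executed.append(step)
    return executed, False
```

Because the fallback receives the already-executed prefix, the work done under the cached plan is preserved, which mirrors the "resume from the current state" behavior described above.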

MeCo demonstrates robust performance across diverse manipulation scenarios (random, similar, and dissimilar), outperforming baseline methods in terms of success rate, planning time, and token consumption, as evaluated on the MeCoBench benchmark with 30 random seeds (detailed results for Sweep Floor and Arrange Cabinet are in Appendix C.1).

Empirical Confirmation: Performance on MeCoBench

MeCoBench is a novel benchmark environment built as an extension of the RoCoBench framework, created to specifically assess the efficacy of similarity-aware multi-robot collaboration systems. While RoCoBench provided a foundation for evaluating collaborative task completion, MeCoBench introduces scenarios and metrics designed to isolate and quantify the benefits of leveraging task similarity for improved performance. This includes a focus on evaluating how effectively systems can reuse previously computed plans or adapt them to new, but related, tasks. The benchmark incorporates a suite of household manipulation tasks, allowing for rigorous testing of planning time, token consumption, and overall success rates in multi-robot collaborative settings, and provides a standardized platform for comparing different similarity-aware approaches.

Experimental results demonstrate that the MeCo framework substantially reduces computational demands compared to the RoCo baseline. Specifically, planning time was reduced by up to 78% when executing tasks such as “Make Sandwich”. This improvement is attributed to MeCo’s similarity-aware approach, which allows for the efficient reuse of previously computed plans and minimizes redundant computations. The observed reductions in planning time directly translate to faster task completion and improved overall system efficiency in multi-robot collaboration scenarios.
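To make the headline figure concrete, and assuming the percentage is measured relative to the RoCo baseline's planning time, the relative reduction can be written as [latex] r = \frac{T_{\text{RoCo}} - T_{\text{MeCo}}}{T_{\text{RoCo}}} [/latex]. A value of [latex] r = 0.78 [/latex] then corresponds to [latex] T_{\text{MeCo}} = (1 - 0.78)\,T_{\text{RoCo}} = 0.22\,T_{\text{RoCo}} [/latex], that is, roughly a 4.5-fold speedup in planning in the best reported case. The symbols [latex] r [/latex], [latex] T_{\text{RoCo}} [/latex], and [latex] T_{\text{MeCo}} [/latex] are notation introduced here for illustration and do not come from the paper.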

Evaluations show that MeCo also improves Success Rate on complex multi-robot tasks, with a gain of approximately 29-32% over baseline methods in both the random and similar scenario configurations. The article attributes this improvement to MeCo’s ability to reuse previously computed plans: by starting from solutions that already succeeded, the system avoids redundant planning and maintains robust performance across diverse task instantiations.

Selective caching within the MeCo framework significantly reduces token consumption during multi-robot task execution. Experiments on the Move Rope, Pack Grocery, and Make Sandwich tasks demonstrated reductions of up to 90% in token usage compared to methods without caching. These savings come from storing and reusing previously computed plans. The system also balances cache size against task diversity, preventing the cache from being dominated by plans for only a limited set of scenarios, so that the efficiency gains carry over to a wider range of collaborative tasks.
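One simple way to keep a cache from being dominated by a single scenario is to cap the number of plans stored per task type. The sketch below shows that idea only; the article does not specify MeCo's actual balancing strategy, and `BalancedPlanCache`, its per-type limit, and the oldest-first eviction rule are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Dict, List


class BalancedPlanCache:
    """Keeps at most `per_task_limit` plans per task type (hypothetical policy)."""

    def __init__(self, per_task_limit: int = 2) -> None:
        self.per_task_limit = per_task_limit
        self._plans: Dict[str, Deque[List[str]]] = {}

    def add(self, task_type: str, plan: List[str]) -> None:
        # deque(maxlen=...) drops the oldest plan of this type once the cap is hit.
        bucket = self._plans.setdefault(task_type, deque(maxlen=self.per_task_limit))
        bucket.append(plan)

    def get(self, task_type: str) -> List[List[str]]:
        """Return the cached plans for a task type, oldest first."""
        return list(self._plans.get(task_type, ()))

    def __len__(self) -> int:
        return sum(len(bucket) for bucket in self._plans.values())
```

Under this policy, repeatedly caching plans for one task type cannot evict plans stored for another, at the cost of discarding older plans within each type.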

MeCo demonstrates robust performance on the Sweep Floor and Arrange Cabinet tasks within MeCoBench, with consistent success rates, planning time, and token consumption across random (S1) and highly similar (S2) scenarios, as benchmarked against RoCo (Mandi et al., 2024).

The Inevitable Trajectory: Towards Robust Robotic Ecosystems

The development of MeCo signifies a crucial advancement in the field of multi-robot systems, addressing a long-standing challenge: achieving scalable collaboration amidst real-world complexity. Traditional approaches often struggle as the number of robots increases, due to exponential growth in computational demands and communication overhead. MeCo circumvents these limitations by enabling robots to efficiently recognize similarities between tasks and, crucially, to reuse successful plans from peers. This innovative methodology doesn’t simply distribute work; it fosters a collaborative intelligence where robots learn from each other in real-time, allowing the system as a whole to adapt and respond effectively to dynamic environments and unforeseen circumstances. The result is a paradigm shift towards genuinely scalable robotic teams capable of tackling intricate problems previously beyond their reach, opening possibilities for applications ranging from large-scale logistics and environmental monitoring to complex search and rescue operations.

The development of MeCo signifies a shift towards more practical multi-robot systems, largely due to its diminished dependence on large language model (LLM) reasoning. Traditional approaches often necessitate substantial computational resources, hindering deployment on robots with limited processing power or battery life. By prioritizing efficient task representation and plan reuse, MeCo circumvents the need for complex, real-time LLM inferences. This allows for the implementation of collaborative robotic systems on smaller, more affordable platforms – potentially expanding applications to areas like environmental monitoring, precision agriculture, and large-scale warehouse automation where resource constraints are a significant barrier. The reduction in computational load not only broadens hardware compatibility but also contributes to lower energy consumption and increased operational longevity for multi-robot deployments.

The core concepts underpinning MeCo – identifying task similarities and intelligently reusing successful plans – extend far beyond multi-robot collaboration. These principles offer a powerful framework for advancing a broad spectrum of robotic applications, notably in autonomous navigation and complex manipulation tasks. Rather than requiring robots to repeatedly solve problems from scratch, a system built on task similarity allows for the efficient retrieval and adaptation of previously successful strategies. This approach dramatically reduces computational demands and accelerates learning, enabling robots to operate more effectively in unpredictable environments. By recognizing that seemingly distinct tasks often share underlying structures, robots can leverage past experiences to achieve new goals with increased speed and robustness, representing a significant step toward more adaptable and intelligent robotic systems.

Ongoing research aims to refine the efficiency of MeCo, concentrating on algorithmic improvements and streamlined data handling to accelerate its operational speed and broaden its applicability. A central component of this development involves investigating methods for automated discovery of task similarity; rather than relying on pre-defined metrics, the system will explore techniques to independently learn what constitutes a similar task based on observed data and outcomes. This adaptive approach promises to significantly enhance the robustness and scalability of multi-robot collaboration, allowing the system to generalize effectively to novel situations and environments without extensive manual recalibration or intervention – ultimately paving the way for more autonomous and versatile robotic teams.

Increasing the number of cached tasks impacts MeCo’s performance, as evidenced by changes in average success rate, planning time, and token consumption across various tasks, evaluated over 30 random seeds.

The pursuit of seamless multi-robot collaboration, as detailed in this work, echoes a fundamental truth: systems are not built, they evolve. MeCo’s approach to task memoization, cleverly reducing redundant LLM calls through similarity identification, isn’t about constructing a perfectly orchestrated team, but cultivating an environment where past experiences inform present actions. It acknowledges that order is merely a temporary reprieve, a cache between inevitable disruptions. As Barbara Liskov observes, “Programs must be right first before they are fast.” This sentiment applies perfectly; efficiency gains through task caching are inconsequential if the underlying collaborative framework isn’t fundamentally sound. The system must survive the chaos, learning and adapting through shared experience.

What Lies Ahead?

The pursuit of efficient multi-robot collaboration, as exemplified by frameworks like MeCo, reveals a fundamental truth: optimization is merely the postponement of inevitable complexity. Task memoization, while effective in the short term, operates under the assumption that the future will resemble the past, a prophecy always destined for revision. The system inevitably encounters novel situations, forcing a re-engagement with the full computational cost of large language model inference. Monitoring these failures isn’t about preventing them; it’s the art of fearing consciously.

Future work must move beyond the caching of successful plans. The true challenge isn’t minimizing LLM calls, but understanding the limits of what can be planned. A more fruitful avenue lies in designing robots that gracefully degrade, that can reason about their own uncertainty, and that can collaboratively redefine tasks during execution. This requires a shift from brittle, pre-defined solutions to systems capable of continuous adaptation and emergent behavior.

Resilience doesn’t emerge from anticipating every failure mode, but from accepting that complete certainty is an illusion. True resilience begins where certainty ends. The focus should therefore shift towards building robotic ecosystems, not architectures – systems that can absorb shocks, evolve, and, crucially, reveal the inherent unpredictability of the world they inhabit.

Original article: https://arxiv.org/pdf/2601.20577.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
