The Rise of Algorithm Teams: AI Designs AI for Complex Problems

Author: Denis Avetisyan


A new framework harnesses the power of large language models to simultaneously optimize both problem-solving algorithms and the prompts that guide them, leading to enhanced performance on challenging optimization tasks.

The framework integrates swarm intelligence optimization, illustrated by algorithms such as the Fireworks Algorithm, with large language model-driven prompt template evolution. The result is a co-evolutionary process in which iterative algorithmic refinement and automated prompt generation interact dynamically, potentially maintaining multiple distinct prompt template pools to enhance performance.

This research introduces a co-evolutionary approach using large language models to design and refine swarm intelligence algorithms and their corresponding prompt templates.

While automated algorithm design has progressed through techniques like evolutionary optimization, a critical limitation remains: the neglect of the prompt engineering that guides these algorithms, particularly when leveraging the power of large language models. This paper, ‘Beyond Algorithm Evolution: An LLM-Driven Framework for the Co-Evolution of Swarm Intelligence Optimization Algorithms and Prompts’, introduces a novel framework that simultaneously evolves both swarm intelligence algorithms and the prompts that direct them, achieving superior performance on NP-hard problems. Our results demonstrate that this co-evolutionary approach not only surpasses existing automated design methods but also reduces reliance on the most powerful LLMs, offering a path towards more cost-effective solutions. Could this paradigm shift redefine the future of optimization, emphasizing the indispensable synergy between algorithmic innovation and intelligent prompting?


The Inevitable Complexity of “Hard” Problems

A significant class of practical challenges – ranging from scheduling aircraft landings to optimizing the placement of distribution centers (the P-Median Problem) – is categorized as NP-Hard. This designation doesn’t imply these problems are unsolvable, but rather that the time required to find the absolute best solution increases exponentially with the size of the problem. Consequently, even with powerful computers, determining an optimal solution for moderately sized instances becomes computationally prohibitive; the number of possible combinations to evaluate quickly overwhelms available resources. This characteristic forces practitioners to explore alternative strategies focused on finding “good enough” solutions within a reasonable timeframe, rather than insisting on mathematical perfection.
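To make that scaling concrete: choosing p facility sites out of n candidates in a P-Median instance means searching C(n, p) combinations. A short standard-library Python sketch shows how quickly exhaustive enumeration becomes hopeless:

```python
from math import comb

# Candidate-solution counts for a p-median instance with p = 10 medians:
# the search space is "n choose 10", which explodes as n grows.
for n in (20, 50, 100):
    print(f"n = {n:3d}: {comb(n, 10):,} candidate solutions")
```

Going from 20 to 100 candidate sites multiplies the search space by roughly eight orders of magnitude, which is why exact enumeration is abandoned in favor of heuristics.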

The inherent difficulty of NP-hard problems lies not in solving small instances, but in their scaling complexity. Traditional optimization algorithms, such as linear programming or exhaustive search, often exhibit performance that degrades exponentially with increasing problem size. This means doubling the number of variables or constraints can dramatically increase computation time, quickly rendering these methods impractical for real-world applications. Consequently, researchers are actively developing innovative approaches – including heuristics, metaheuristics like genetic algorithms and simulated annealing, and approximation algorithms – that prioritize finding acceptable, though not necessarily perfect, solutions within reasonable timeframes. These methods trade off optimality for speed, providing a pragmatic path forward when exhaustive searches become computationally infeasible, and allowing for practical solutions to complex logistical, scheduling, and resource allocation challenges.

Swarm intelligence algorithms – including PSO, FWA, and ACO – provide both directly applicable optimization strategies for NP-hard problems and a source of adaptable heuristics to enhance existing specialized tools.

Borrowing Wisdom From the Swarm

Swarm intelligence algorithms represent a computational paradigm inspired by the collective behaviors observed in social insects, such as ants, bees, and termites. These algorithms address complex optimization problems by employing a population of agents that interact locally with each other and their environment. Unlike traditional optimization techniques that may rely on gradient information or exhaustive search, swarm intelligence methods leverage decentralized control and self-organization to explore vast and often high-dimensional solution spaces. The Fireworks Algorithm is one such example, falling under the broader category of population-based, stochastic optimization techniques designed to efficiently identify optimal or near-optimal solutions without requiring centralized control or detailed problem-specific knowledge. This approach is particularly beneficial when dealing with non-differentiable, noisy, or multi-modal objective functions where conventional methods may struggle.

The Fireworks Algorithm employs two primary operators to evolve a population of potential solutions. The Explosion operator disperses a firework’s energy – representing a solution – into multiple sparks, creating new candidate solutions in the search space. These sparks are then subject to the Mutation operator, which introduces random changes to their characteristics, allowing for further exploration and adaptation. This iterative process of explosion and mutation simulates the generation of diverse ideas followed by their refinement, driving the algorithm toward optimal solutions by continuously modifying and evaluating the population.
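As an illustration (not the paper's implementation), the two operators can be sketched for a continuous search space; the function names, amplitudes, and rates here are all illustrative choices:

```python
import random

def explode(firework, amplitude, n_sparks):
    """Explosion: scatter sparks uniformly around a solution vector."""
    return [[x + random.uniform(-amplitude, amplitude) for x in firework]
            for _ in range(n_sparks)]

def mutate(spark, rate=0.2, sigma=0.5):
    """Mutation: Gaussian perturbation of each coordinate with probability `rate`."""
    return [x + random.gauss(0.0, sigma) if random.random() < rate else x
            for x in spark]

random.seed(42)
sparks = [mutate(s) for s in explode([0.0, 0.0], amplitude=1.0, n_sparks=5)]
```

In the full algorithm, better-performing fireworks typically receive smaller amplitudes (local refinement) and worse ones larger amplitudes (exploration).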

The Selection Operator within the Fireworks Algorithm is responsible for determining which fireworks, representing potential solutions, are propagated to the next iteration. This operator employs a fitness-based approach, evaluating each firework’s performance against the objective function and ranking them accordingly. A subset of the highest-performing fireworks is then selected for reproduction, while those with lower fitness values are discarded. This process ensures that the algorithm focuses computational resources on exploring areas of the search space that exhibit greater promise, effectively guiding the search toward optimal or near-optimal solutions. The specific method for ranking and selecting fireworks can vary, but generally involves parameters controlling the selection pressure and diversity maintenance.
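A minimal fitness-based selection – greedy elitism, one of several possible schemes – might look like this sketch:

```python
def select(population, fitness, k):
    """Keep the k best fireworks; here lower fitness values are better."""
    return sorted(population, key=fitness)[:k]

# Toy 1-D solutions scored by distance to the optimum at 0.
population = [[3.0], [0.5], [-1.2], [2.0]]
survivors = select(population, fitness=lambda s: abs(s[0]), k=2)
# survivors == [[0.5], [-1.2]]
```

Pure elitism maximizes selection pressure; practical variants mix in distance-based or probabilistic selection to preserve diversity.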

A large language model streamlines the evolutionary process of the Fireworks Algorithm, as detailed in [cen2025llm].

Letting the Algorithm Evolve Itself

Algorithm-Prompt Co-Evolution represents a developing methodology for integrating Large Language Models (LLMs) into computational optimization processes. This approach moves beyond simply using LLMs as black-box predictors and instead utilizes them as active components within the optimization loop. Specifically, LLMs are employed to generate and assess novel algorithmic structures or components, effectively co-evolving the algorithm itself alongside the prompting strategies used to guide the LLM. This differs from traditional optimization techniques by introducing a learned, adaptable algorithmic element driven by the LLM’s generative capabilities, and has shown promise in tackling complex NP-Hard problems where conventional algorithms struggle.
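The loop can be caricatured as follows. Everything here is a toy stand-in, not the paper's system: `propose` substitutes for an LLM call, the "algorithms" are just numbers, and the template mutations are trivial string edits.

```python
import random

def co_evolve(propose, evaluate, algorithm, template, generations=30):
    """Toy algorithm-prompt co-evolution: each generation mutates the
    prompt template, asks the (stubbed) LLM for a new algorithm, and
    keeps the pair only if measured performance improves."""
    best = evaluate(algorithm)
    for _ in range(generations):
        new_template = template + random.choice(["", " Be concise.", " Explain steps."])
        new_algorithm = propose(new_template, algorithm)
        score = evaluate(new_algorithm)
        if score > best:
            algorithm, template, best = new_algorithm, new_template, score
    return algorithm, template, best

random.seed(0)
# Stub "LLM": nudges a numeric algorithm; fitness peaks at 10.
alg, template, best = co_evolve(
    propose=lambda t, a: a + random.uniform(-1.0, 2.0),
    evaluate=lambda a: -abs(a - 10.0),
    algorithm=0.0, template="Improve the code.")
```

The key structural point the sketch preserves is that the prompt and the algorithm are varied and accepted together, so selection acts on the pairing rather than on either artifact alone.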

Prompt Engineering and Prompt Templates are central to directing Large Language Models (LLMs) in the automated generation and assessment of algorithmic components. Prompt Engineering involves the careful design of input prompts to elicit specific code or algorithmic structures from the LLM. Prompt Templates provide a structured framework for these prompts, defining the input format, desired output, and evaluation criteria. This methodology enables the LLM to act as a programmable code generator, iteratively refining algorithmic components based on the feedback provided within the prompts. The LLM evaluates candidate solutions, often utilizing cost functions or predefined success metrics specified in the prompt, and subsequently generates new components designed to optimize performance against those criteria. This iterative process allows for the exploration of a vast solution space without explicit human intervention in the code development.
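For illustration only (the paper's actual templates are evolved, not fixed), a template of this kind might embed the candidate code and its measured cost as placeholders:

```python
# Hypothetical prompt template: placeholders carry the current algorithm
# and its benchmark cost, and the LLM is asked for an improved variant.
TEMPLATE = """You are refining a heuristic for the {problem} problem.
Current algorithm:
{code}
Benchmark cost: {cost}
Return only an improved algorithm that lowers this cost."""

prompt = TEMPLATE.format(problem="p-median",
                         code="def solve(instance): ...",
                         cost=1234.5)
```

Because the cost is fed back into the next prompt, the LLM receives an explicit optimization signal rather than an open-ended code-generation request.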

Simultaneous co-evolution of both the optimization algorithm and the prompting structure yields substantial performance gains when addressing NP-Hard problems. This approach iteratively refines not only the algorithmic components, such as the Fireworks Algorithm, but also the prompts used to guide the Large Language Model (LLM). Testing with GPT-5 demonstrated an average performance improvement of 112.57% across a range of NP-Hard problem instances, indicating a significant increase in solution quality and adaptability compared to traditional optimization methods or systems with static prompting.

Existing LLM-assisted algorithm design frameworks, including EoH, FunSearch, and ReEvo, have established the viability of this approach to optimization. However, comparative analysis demonstrates that our framework consistently achieves superior performance: across multiple problem instances, it yields a greater-than-5% improvement over these existing frameworks. This performance advantage indicates more efficient algorithm generation and evaluation, enabled by the specific design of our framework, and contributes to more effective solutions for complex optimization challenges.

Utilizing GPT-5, the developed framework achieved a 154.34% performance improvement when applied to the Airplane11 test problem. This result was obtained through the co-evolution of both the optimization algorithm and the prompting structure guiding the Large Language Model. Airplane11 is a computationally challenging instance within the class of NP-Hard problems, specifically relating to the optimization of aircraft routing and scheduling. The substantial performance gain demonstrates the framework’s capacity to effectively address complex combinatorial optimization challenges and surpasses the performance observed with other LLM-assisted algorithm design frameworks on this specific problem instance.

The LLM+FWA+P framework iteratively refines prompt templates (red dots) through mutation and crossover (denoted as m_i and c_i), navigating a landscape of local maxima to ultimately achieve optimal performance with m_4 on a given PMU problem instance.

A Shift in How We Tackle the Intractable

The convergence of large language models (LLMs) and optimization algorithms represents a substantial departure from conventional methods for tackling NP-Hard problems. Historically, these computationally intensive challenges have relied on heuristics – rule-of-thumb strategies offering good, but not necessarily optimal, solutions within a reasonable timeframe. However, integrating LLMs introduces a new dynamic, allowing algorithms to learn and adapt strategies based on reasoning and pattern recognition, rather than pre-defined rules. This shifts the focus from designing specific heuristics to cultivating an algorithmic approach that can independently discover and refine solutions, potentially unlocking performance gains previously unattainable. The result isn’t simply an improvement on existing methods, but a fundamental change in how complex problems are approached, opening possibilities for more robust and efficient solutions across diverse fields.

The synergy between large language models and optimization algorithms extends far beyond theoretical computer science, holding considerable promise for practical applications across diverse sectors. In logistics, particularly within the complex constraints of the Flow Shop Scheduling Problem, LLMs can potentially devise more efficient production sequences, minimizing completion times and resource wastage. Resource allocation, a perennial challenge in fields ranging from finance to healthcare, benefits from the LLM’s ability to analyze intricate dependencies and suggest optimal distribution strategies. Furthermore, the Equitable Partition Problem – ensuring fair division of resources or responsibilities – gains a powerful new tool, as LLMs can navigate the inherent trade-offs and propose solutions that maximize equity while adhering to specified constraints. This broad applicability suggests a future where LLM-assisted optimization becomes integral to solving real-world problems requiring both computational power and nuanced reasoning.

The capacity of large language models (LLMs) extends beyond simple pattern recognition, offering a pathway to genuinely innovative algorithmic design. Rather than relying on pre-programmed heuristics or exhaustive searches, these models demonstrate an ability to reason about problem structures and propose solutions that deviate from conventional approaches. Studies indicate that LLMs can generate code implementing novel optimization strategies for complex challenges, often outperforming existing methods in areas like resource allocation and scheduling. This isn’t merely about faster computation; it suggests the potential for discovering entirely new classes of algorithms, particularly beneficial when tackling NP-Hard problems where traditional techniques falter. The inherent reasoning abilities of LLMs thus present a promising avenue for achieving substantial performance gains and pushing the boundaries of optimization research, potentially unlocking solutions previously considered intractable.

Ongoing investigations are heavily focused on perfecting co-evolutionary techniques, where large language models (LLMs) and optimization algorithms learn and improve in tandem. Researchers are particularly interested in identifying the boundaries of LLM-assisted optimization – determining the types of problems where this synergy yields substantial advantages and where traditional methods remain more effective. This includes exploring different LLM prompting strategies, refining the methods for translating LLM suggestions into executable code, and developing robust evaluation metrics that accurately capture the quality of solutions discovered through this co-evolutionary process. Ultimately, this research aims to move beyond simply demonstrating feasibility and towards a comprehensive understanding of when and how LLMs can fundamentally enhance our ability to tackle complex, computationally challenging problems, potentially leading to breakthroughs in fields ranging from supply chain management to scientific discovery.

The pursuit of automated algorithm design, as detailed in this framework, often feels like chasing a mirage. It’s tempting to believe a perfect optimization strategy can emerge, but reality invariably introduces unforeseen constraints. This paper attempts to address that complexity by co-evolving both algorithms and the prompts that guide them – a sensible approach, if only because production environments rarely conform to neat theoretical models. As Linus Torvalds famously said, “Talk is cheap. Show me the code.” This sentiment perfectly encapsulates the need to move beyond abstract improvements and demonstrate tangible results, especially when dealing with NP-hard problems where elegance often gives way to pragmatic necessity. The framework’s success isn’t about finding the best algorithm, but a functional one that survives contact with the real world.

What’s Next?

The presented co-evolutionary framework, while demonstrating performance gains, merely shifts the locus of future failure. The bug tracker will not remain empty. Currently, the system optimizes for specific problem instances. The inevitable production deployment will reveal that “NP-hard” is not a polite suggestion, but a fundamental constraint. The algorithms, honed on benchmarks, will encounter data distributions that resemble nothing in the training set – and the prompts, so carefully crafted, will become brittle incantations against unforeseen inputs.

The real challenge isn’t building a better optimizer, but accepting that optimization has diminishing returns. The pursuit of ‘general’ swarm intelligence is a siren song. It’s more likely that the field will fragment into highly specialized heuristics, each a fragile adaptation to a narrow domain. The system treats prompts as parameters, but the semantics of language are far more fluid. Future work will likely involve adversarial prompt generation, not to improve the algorithm, but to map the boundaries of its competence – to document precisely where it breaks.

The promise of automated algorithm design is always a reduction in human effort. But effort doesn’t vanish – it’s merely redistributed. The system doesn’t solve problems; it externalizes the cost of failure. The team doesn’t deploy – it lets go.


Original article: https://arxiv.org/pdf/2512.09209.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/


2025-12-11 09:41