AI Chemist: Automating the Search for Better Catalysts

Author: Denis Avetisyan

A new AI agent streamlines the process of discovering and optimizing materials that accelerate chemical reactions, promising a faster path to sustainable and efficient chemistry.

This work presents Catalyst-Agent, an autonomous system leveraging machine learning and large language models to screen and optimize heterogeneous catalyst materials using density functional theory calculations.

The rational design of heterogeneous catalysts remains a significant bottleneck despite decades of research relying on expensive experimentation or computationally intensive first-principles methods. This work introduces ‘Catalyst-Agent: Autonomous heterogeneous catalyst screening and optimization with an LLM Agent’, a novel AI agent that leverages machine learning, specifically graph neural networks for adsorption energy calculations, and a large language model to autonomously explore material databases and refine promising candidates. Catalyst-Agent achieves a 23-34% success rate in identifying viable catalysts across reactions including oxygen, nitrogen, and carbon dioxide reduction, converging on solutions in just 1-2 iterations. Could this approach usher in a new era of accelerated materials discovery with minimal human intervention, fundamentally reshaping the landscape of catalytic design?

The Burden of Discovery: A Catalyst’s Challenge

The pursuit of sustainable energy solutions is significantly hampered by the protracted and costly nature of traditional catalyst discovery. Historically, identifying effective catalysts (substances that accelerate chemical reactions) has relied on exhaustive trial-and-error experimentation, often involving the synthesis and testing of countless materials. This process demands substantial time and money and frequently yields limited progress. The sheer number of potential catalytic materials, coupled with the complex interplay of factors governing their performance, creates a vast search space that conventional methods struggle to navigate efficiently. Consequently, breakthroughs in crucial areas like carbon dioxide reduction, nitrogen fixation, and oxygen evolution, all vital for a cleaner energy future, are considerably delayed by these limitations in materials discovery.

The pursuit of efficient catalysts for reactions involving carbon dioxide, nitrogen, and oxygen represents a critical frontier in addressing global climate change and burgeoning energy demands. Transforming atmospheric carbon dioxide into usable fuels, converting nitrogen into ammonia for sustainable agriculture, and optimizing oxygen reduction for advanced energy storage all hinge on materials that can accelerate these processes with minimal energy input. Current industrial processes for these reactions are often energy-intensive and rely on scarce or environmentally problematic materials. Consequently, the development of novel catalysts capable of performing these transformations at scale, under mild conditions, and with earth-abundant elements promises not only a reduction in greenhouse gas emissions but also a pathway towards a more sustainable and secure energy future. The limitations of existing catalytic materials underscore the urgent need for innovative approaches and the exploration of previously unconsidered chemical compositions and structures.

Predicting catalytic performance remains a significant challenge for computational materials science, largely due to the intricate interplay of factors at play during a chemical reaction. Existing methods, while valuable, often simplify these complexities, struggling to accurately model the dynamic electronic structure, surface phenomena, and adsorbate interactions critical for catalysis. These limitations stem from the sheer number of degrees of freedom involved – including the composition, structure, and environment of the catalyst – coupled with the approximations inherent in density functional theory and other computational techniques. Consequently, in silico predictions frequently diverge from experimental results, necessitating extensive and costly trial-and-error experimentation. This disconnect underscores the need for more sophisticated approaches capable of capturing the nuanced behavior of catalytic systems and guiding the rational design of high-performance materials.

Catalyst-Agent represents a significant departure from conventional catalyst discovery methods, functioning as a fully autonomous AI agent capable of navigating the complex landscape of materials science. This system integrates automated workflows, encompassing computational modeling, data analysis, and experimental design, to iteratively refine and optimize potential catalytic materials. Unlike traditional approaches reliant on human intuition and exhaustive screening, Catalyst-Agent employs reasoning algorithms to intelligently propose, evaluate, and select promising candidates, significantly accelerating the discovery process. The agent’s capacity for self-directed learning and adaptation allows it to explore a vast chemical space with unprecedented efficiency, potentially unlocking novel catalysts for crucial reactions like carbon dioxide reduction and nitrogen fixation, and offering a pathway toward more sustainable energy technologies.

Intelligent Exploration: The Agent’s Design

Catalyst-Agent employs the GPT-5.2 large language model to automate key stages of catalyst discovery, including experimental plan generation, task orchestration, and logical reasoning. GPT-5.2 functions as the central control system, interpreting objectives and formulating a sequence of actions to achieve desired outcomes. This includes determining appropriate computational methods, selecting relevant materials from databases, and analyzing simulation results. The model’s reasoning capabilities enable it to adapt the workflow based on intermediate findings, effectively navigating the complex search space for novel catalyst materials and optimizing the discovery process without direct human intervention.

Catalyst-Agent employs the Model Context Protocol (MCP) to facilitate communication with independent tool servers, each designed for specific tasks such as materials database querying or computational analysis. The MCP defines a standardized interface for exchanging information – including prompts, results, and contextual data – between the large language model and these servers. This architecture promotes modularity by allowing individual tool servers to be updated or replaced without affecting the core agent logic. Scalability is achieved through the ability to easily add new tool servers to expand the agent’s capabilities and to distribute computational workloads across multiple resources, enhancing overall processing efficiency and throughput.
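
To make the tool-server idea concrete, here is a minimal sketch of how a request from the language model might be routed to a named tool. It does not use the official MCP SDK and is not the paper's implementation. It only mimics the JSON-RPC 2.0 shape of a `tools/call` request in plain Python, and the tool names (`query_materials`, `adsorption_energy`) and their canned return values are hypothetical.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical tool registry; the names are illustrative, not from the paper.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(name: str):
    """Register a function as a callable tool under the given name."""
    def decorator(fn):
        TOOLS[name] = fn
        return fn
    return decorator


@tool("query_materials")
def query_materials(elements: list) -> dict:
    # Stand-in for a database server; returns a canned label, not real data.
    return {"matches": ["-".join(sorted(elements)) + "-demo"]}


@tool("adsorption_energy")
def adsorption_energy(material: str, adsorbate: str) -> dict:
    # Stand-in for a compute server; the energy is a placeholder, not a result.
    return {"material": material, "adsorbate": adsorbate, "energy_eV": None}


def handle_request(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 'tools/call' request to a registered tool."""
    req = json.loads(raw)
    params = req.get("params", {})
    name = params.get("name")
    if req.get("method") != "tools/call" or name not in TOOLS:
        error = {"code": -32601, "message": "Method or tool not found"}
        return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), "error": error})
    result = TOOLS[name](**params.get("arguments", {}))
    content = [{"type": "text", "text": json.dumps(result)}]
    return json.dumps(
        {"jsonrpc": "2.0", "id": req.get("id"), "result": {"content": content}}
    )
```

Adding a capability in this sketch means registering one more function; the dispatcher and the caller stay unchanged, which is the modularity argument in miniature.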

Catalyst-Agent utilizes the standardized Open Platform for Materials Data (OPTIMADE) Application Programming Interface (API) to access and query data from extensive materials databases, specifically the Materials Project and the Open Quantum Materials Database (OQMD). This connection allows the agent to programmatically retrieve information on a diverse range of materials, including their structural properties, electronic characteristics, and calculated energies. The OPTIMADE API ensures interoperability and facilitates the automated retrieval of millions of candidate catalyst structures, significantly broadening the scope of initial screening beyond what is feasible with manual methods.
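
As a rough illustration of what such a query looks like, the sketch below builds an OPTIMADE `structures` request URL using the standard `filter`, `response_fields`, and `page_limit` parameters. The provider base URLs and the helper name `build_structures_url` are assumptions made for this example, and the article does not say which filters the agent actually issues. The function only constructs the URL and makes no network call.

```python
from urllib.parse import urlencode

# Base URLs are assumptions for illustration; check each provider's docs.
PROVIDERS = {
    "mp": "https://optimade.materialsproject.org",
    "oqmd": "https://oqmd.org/optimade",
}


def build_structures_url(provider, elements, max_elements=None, page_limit=20):
    """Build an OPTIMADE /structures URL for entries containing all `elements`."""
    quoted = ", ".join('"{}"'.format(el) for el in elements)
    clauses = ["elements HAS ALL {}".format(quoted)]
    if max_elements is not None:
        # Restrict the number of distinct elements, e.g. binaries only.
        clauses.append("nelements<={}".format(max_elements))
    query = urlencode({
        "filter": " AND ".join(clauses),
        "response_fields": "chemical_formula_reduced,nelements",
        "page_limit": page_limit,
    })
    return "{}/v1/structures?{}".format(PROVIDERS[provider], query)
```

For example, `build_structures_url("mp", ["Pt", "Ni"], max_elements=2)` asks for Pt-Ni binaries. Because both databases expose the same query grammar, switching providers only changes the base URL.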

Automated workflows within Catalyst-Agent substantially decrease the time and resource expenditure associated with initial catalyst screening. Traditionally, identifying potential catalyst candidates involved manual literature review, data curation, and computationally expensive calculations for a limited number of materials. Catalyst-Agent, through its integration with materials databases like Materials Project and OQMD via the OPTIMADE API, enables the rapid assessment of a vastly expanded chemical space. This automated process eliminates manual data handling, accelerates the computational screening of potential candidates, and prioritizes structures for further, more detailed analysis, ultimately reducing the overall time-to-discovery and associated research costs.

Adsorption Energies: A Predictive Foundation

Catalyst-Agent employs adsorption energy – the energy released when a reactant binds to a catalyst surface – as a primary descriptor for evaluating catalytic performance. This metric provides a quantitative assessment of the strength of interaction between reactants and the catalyst, directly correlating with reaction rates and overall reactivity. Specifically, the approach leverages the principle that optimal catalytic activity arises when reactant adsorption is neither too weak – hindering reaction initiation – nor too strong – inhibiting product desorption. By accurately calculating adsorption energies, Catalyst-Agent can predict the relative performance of different catalyst materials and prioritize those exhibiting adsorption characteristics conducive to efficient chemical transformations.
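
A small sketch can make the descriptor and the "neither too weak nor too strong" criterion concrete. It assumes the common sign convention in which the adsorption energy is the energy of the combined system minus the energies of the clean slab and the isolated adsorbate, so a negative value means energy is released on binding. The function names and the window bounds passed by the caller are illustrative assumptions, not values from the paper.

```python
def adsorption_energy(e_slab_ads, e_slab, e_adsorbate):
    """E_ads = E(slab + adsorbate) - E(slab) - E(adsorbate), all in eV.

    With this convention, a more negative value means stronger binding.
    """
    return e_slab_ads - e_slab - e_adsorbate


def classify_binding(e_ads, lower, upper):
    """Compare an adsorption energy with a target window [lower, upper] in eV.

    The window itself is reaction-specific and must be supplied by the caller.
    """
    if e_ads < lower:
        return "too strong"  # binds so tightly that products cannot desorb
    if e_ads > upper:
        return "too weak"    # binds too loosely to activate the reactant
    return "in window"
```

With made-up total energies of -105.5 eV (slab plus adsorbate), -100.0 eV (slab), and -5.0 eV (adsorbate), `adsorption_energy` gives -0.5 eV, which a window of -0.8 to -0.2 eV would classify as "in window".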

The AdsorbML workflow calculates adsorption energies by integrating machine learning potentials (MLPs) with periodic slab calculations. This approach constructs surface models using crystallographic slabs, representing the catalyst material’s surface. MLPs, trained on density functional theory (DFT) data, are then used to efficiently predict the energies of adsorbate molecules on these slabs, avoiding the computational cost of a full DFT calculation for each adsorption configuration. By combining the speed of MLPs with accuracy close to that of the DFT data they were trained on, AdsorbML enables high-throughput screening of adsorption energies for diverse materials and adsorption sites, providing a practical means to assess catalytic activity.
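
The core idea, scoring many candidate placements cheaply and keeping the most stable one, can be sketched in a few lines. This is not the AdsorbML code. The two energy callables stand in for a machine learning potential and an optional higher-fidelity check, the optional re-scoring step is an assumption that this article does not describe, and all names are hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Config = TypeVar("Config")


def lowest_adsorption_energy(
    configs: Iterable[Config],
    ml_energy: Callable[[Config], float],
    reference_energy: Optional[Callable[[Config], float]] = None,
    top_k: int = 5,
) -> Tuple[float, Config]:
    """Return (energy, configuration) for the most stable adsorbate placement.

    Every configuration is scored with the cheap ML model. If a reference
    calculator is given, only the top_k lowest-energy candidates are re-scored
    with it, so the expensive method runs on a short list.
    """
    ranked = sorted(((ml_energy(c), c) for c in configs), key=lambda pair: pair[0])
    shortlist = ranked[:top_k]
    if not shortlist:
        raise ValueError("no adsorbate configurations supplied")
    if reference_energy is not None:
        shortlist = [(reference_energy(c), c) for _, c in shortlist]
    return min(shortlist, key=lambda pair: pair[0])
```

The design choice worth noting is that the expensive calculator, when present, only ever sees `top_k` candidates, no matter how many placements were enumerated.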

The Catalyst-Agent methodology enables high-throughput virtual screening of potential catalyst materials by computationally determining the adsorption energies of key intermediate species. This process circumvents the need for extensive and time-consuming experimental synthesis and characterization. By calculating adsorption energies using the AdsorbML workflow, the system can efficiently evaluate a large number of candidate materials and prioritize those exhibiting favorable binding energies – typically, energies that are neither too strong nor too weak – for desired reactants and intermediates. This rapid assessment allows researchers to focus experimental efforts on a reduced set of promising materials, accelerating the discovery of novel and effective catalysts.

Catalyst-Agent’s predictive capability was evaluated on three electrocatalytic reactions: the oxygen reduction reaction (ORR), the nitrogen reduction reaction (NRR), and the carbon dioxide reduction reaction (CO2RR). Across these reactions, Catalyst-Agent identified viable catalysts with a success rate of 23 to 34 percent. This suggests that adsorption energies can serve as a workable screening descriptor despite the complexity and diversity of these reactions and the challenges inherent in computational catalyst design.

Surface Engineering: Precision and Refinement

Catalyst-Agent employs sophisticated surface modification techniques – specifically, top-layer substitution and the application of mechanical strain – to precisely tailor the characteristics of potential catalytic materials. These aren’t random alterations; instead, the system intelligently manipulates the outermost atomic layers of a catalyst, swapping in different elements or inducing stress within the material’s structure. Such targeted adjustments have a profound impact on the electronic properties of the catalytic surface, effectively ‘tuning’ its ability to bind with reactants and accelerate chemical reactions. By strategically altering the surface, Catalyst-Agent can optimize a material’s performance, fostering enhanced activity and selectivity for desired chemical transformations.
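
The two modifications can be illustrated on a toy slab model. The sketch below assumes the surface normal points along z and the in-plane lattice vectors lie in the xy plane. It replaces every atom in the topmost layer, whereas the real system may substitute only some sites. The `Slab` class and both function names are invented for this example.

```python
from dataclasses import dataclass, replace
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Slab:
    cell: Tuple[Vec3, ...]       # lattice vectors a, b, c in angstrom
    symbols: Tuple[str, ...]     # chemical symbol of each atom
    positions: Tuple[Vec3, ...]  # Cartesian coordinates in angstrom


def substitute_top_layer(slab: Slab, new_symbol: str, tol: float = 0.1) -> Slab:
    """Replace every atom within `tol` angstrom of the highest z coordinate."""
    z_max = max(p[2] for p in slab.positions)
    symbols = tuple(
        new_symbol if z_max - p[2] <= tol else s
        for s, p in zip(slab.symbols, slab.positions)
    )
    return replace(slab, symbols=symbols)


def _scale_xy(v: Vec3, factor: float) -> Vec3:
    return (v[0] * factor, v[1] * factor, v[2])


def apply_biaxial_strain(slab: Slab, strain: float) -> Slab:
    """Scale in-plane (x, y) components of cell and positions by 1 + strain.

    Positive strain is tensile, negative is compressive; z is left untouched.
    """
    factor = 1.0 + strain
    return replace(
        slab,
        cell=tuple(_scale_xy(v, factor) for v in slab.cell),
        positions=tuple(_scale_xy(p, factor) for p in slab.positions),
    )
```

Both functions return a new `Slab` rather than mutating the input, so an unmodified reference structure stays available for comparison.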

Catalyst modification hinges on the ability to sculpt the electronic landscape of a material’s surface, directly influencing how reactants interact and transform. By strategically altering the arrangement and energy levels of surface atoms, researchers can enhance a catalyst’s affinity for specific molecules, lowering activation energies and accelerating desired reactions. This precise control extends beyond simply increasing reaction speed; it also allows for the optimization of selectivity – the preference for one reaction pathway over another – minimizing unwanted byproducts and maximizing the yield of valuable compounds. This level of surface engineering enables the creation of catalysts tailored to specific chemical transformations, unlocking efficiencies unattainable with conventional materials.

Catalyst-Agent demonstrates a remarkable capacity to enhance catalytic performance through strategically implemented surface modifications. The system doesn’t simply iterate through random changes; instead, it intelligently applies techniques like top-layer substitution and strain application to fine-tune the catalyst’s electronic structure and reactivity. This targeted approach consistently yields materials with demonstrably improved activity and selectivity – meaning they not only speed up desired chemical reactions, but also minimize the production of unwanted byproducts. The resulting catalysts frequently outperform their unmodified counterparts, representing a significant advancement in materials discovery and offering pathways to more efficient and sustainable chemical processes.

The Catalyst-Agent system showcases remarkable efficiency in materials discovery, consistently identifying successful catalyst formulations within a mere one to two iterative trials. This rapid convergence highlights the power of its autonomous optimization process, which intelligently navigates the vast chemical space of potential surface modifications. Unlike traditional, often laborious, catalyst development, Catalyst-Agent’s streamlined approach drastically reduces the time and resources needed to achieve high-performing materials, suggesting a paradigm shift in the field of catalytic chemistry and a scalable path towards customized catalysts for diverse applications.
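
The iterate-until-viable behavior can be pictured as a short loop: evaluate a candidate, stop if its descriptor falls inside the target window, otherwise ask for a modified candidate and try again. The sketch below is a schematic under stated assumptions, not the agent's actual control flow. In the real system the proposal step is the language model's reasoning, while here `propose` and `evaluate` are hypothetical callables supplied by the caller.

```python
def refine(candidate, evaluate, propose, lower, upper, max_iterations=2):
    """Modify a candidate until its energy lies in [lower, upper].

    Iteration 0 evaluates the unmodified candidate. Returns a tuple of
    (accepted candidate or None, iterations used, evaluation history).
    """
    history = []
    for iteration in range(max_iterations + 1):
        energy = evaluate(candidate)
        history.append((candidate, energy))
        if lower <= energy <= upper:
            return candidate, iteration, history
        if iteration < max_iterations:
            # Ask for a modified candidate, informed by the latest energy.
            candidate = propose(candidate, energy)
    return None, max_iterations, history
```

Capping `max_iterations` at a small number mirrors the reported one-to-two-iteration convergence, and returning the history keeps failed attempts available for inspection.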

The pursuit of efficient catalyst design, as demonstrated by Catalyst-Agent, echoes a fundamental principle of elegant problem-solving. The system’s autonomous screening and optimization, which leverages machine learning and computational tools, strips away unnecessary complexity in material discovery. This aligns with a remark attributed to Andrey Kolmogorov: “The shortest path between two truths runs through a universe of possibilities.” Catalyst-Agent doesn’t exhaustively explore every material; instead, it intelligently navigates the vast chemical space, focusing on the most promising candidates. The reduction of computational cost and human intervention showcases how paring back extraneous variables reveals the core principles governing catalytic activity, a process of refinement where clarity emerges from complexity.

Where to Next?

The demonstrated autonomy, while promising, merely highlights the volume of assumptions baked into the current paradigm. Catalyst-Agent, like all such systems, operates within a pre-defined search space, a landscape sculpted by human intuition and computational tractability. The true challenge isn’t accelerating exploration within known territory, but expanding the map itself. Future iterations must address the implicit biases of the training data and, more fundamentally, develop methods for venturing beyond the confines of established chemical intuition.

A persistent limitation remains the translation of computational predictions to real-world performance. Density Functional Theory, for all its utility, is an approximation. The agent’s ability to optimize for a theoretical metric does not guarantee success in a messy, heterogeneous environment. The next logical step involves tighter feedback loops, integrating experimental data with greater efficiency – not as validation, but as a guiding force for recalibrating the agent’s internal models.

Ultimately, the pursuit of autonomous catalyst design isn’t about replacing chemists, but augmenting their capabilities. The system functions best when it’s asked to solve well-defined problems, problems stripped of the ambiguity that fuels human creativity. The real frontier lies in developing agents capable of framing the right questions – a task that demands a level of abstraction currently beyond the reach of even the most sophisticated algorithms.

Original article: https://arxiv.org/pdf/2603.01311.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
