AI Agents Take on Materials Science

Author: Denis Avetisyan


A new infrastructure empowers artificial intelligence to autonomously design and execute complex computational experiments, accelerating materials discovery and chemical research.

AtomisticSkills provides a composable library of agentic research skills (spanning materials science, chemistry, drug discovery, and machine learning) that function as modular building blocks for constructing complete research workflows.

This paper introduces AtomisticSkills, an open-source agentic research infrastructure for building reliable and reproducible scientific workflows using modular skills and AI-driven automation.

Despite advances in computational materials science and chemistry, scaling autonomous research capabilities remains challenging due to fractured software ecosystems and workflow complexity. This work, ‘Harnessing AtomisticSkills for Agentic Atomistic Research’, introduces AtomisticSkills, an open-source framework that lets AI agents conduct complex atomistic research through modular, extensible skills integrating diverse tools such as density functional theory and machine learning interatomic potentials. By hierarchically decomposing scientific workflows, AtomisticSkills facilitates reliable and reproducible research campaigns, from materials discovery to drug design, and demonstrates robust orchestration across multiple scientific domains. Could this agentic infrastructure represent a crucial step towards fully autonomous AI scientists capable of accelerating scientific innovation?


Unveiling Systemic Patterns: The Bottleneck in Atomistic Research

Atomistic simulations, while powerful tools for understanding material behavior at the most fundamental level, present a substantial computational burden. Each simulation models the interactions of individual atoms and demands considerable processing time and resources, particularly when exploring complex systems or extended timescales. The expense isn’t solely tied to processing power; significant manual effort is also required to design the simulation workflow itself. Researchers must carefully define parameters, select appropriate force fields, and validate results, a process that is both time-consuming and dependent on specialized expertise. Because these simulations scale poorly, it is difficult to rapidly screen vast chemical spaces or accurately predict the properties of novel materials, which creates a bottleneck in modern materials discovery and calls for new approaches to accelerate the research process.

The challenge of discovering novel materials is fundamentally limited by the sheer size of the chemical space – the countless possible combinations of elements and structures. Traditional computational methods, while accurate for predicting the properties of known compounds, struggle to efficiently navigate this vast landscape. A brute-force approach, testing every conceivable material, is computationally infeasible, while relying solely on intuition or human expertise often overlooks promising candidates. Consequently, materials discovery remains a slow and resource-intensive process. Researchers are actively developing methods – including machine learning potentials and active learning algorithms – to intelligently sample chemical space, prioritizing the investigation of materials most likely to exhibit desired characteristics and thereby accelerating the pace of innovation.

The accelerating pace of materials science and chemical discovery is rapidly outpacing the capabilities of traditional, manually-driven research methods. A critical need now exists for automated frameworks capable of independently executing complex workflows, from initial hypothesis generation to data analysis and model refinement. These systems must not only handle the computational demands of atomistic simulations, but also exhibit adaptability, allowing them to dynamically adjust to new experimental data or shifting research priorities. Scalability is equally vital; frameworks must efficiently utilize high-performance computing resources to explore vast chemical spaces and accelerate the identification of promising materials – a task simply impossible with current, largely sequential, approaches. Ultimately, the future of atomistic research hinges on developing these self-optimizing, adaptable, and scalable systems to unlock the full potential of computational materials discovery.

Atomistic research workflows, traditionally designed as linear sequences of steps, often prove inflexible when confronted with unexpected results or shifts in investigative focus. This rigidity stems from the difficulty of integrating new data streams or modifying established parameters mid-simulation without restarting the entire process, a significant impediment to iterative exploration. Consequently, researchers frequently encounter bottlenecks when attempting to adapt their approaches based on preliminary findings, hindering the potential for serendipitous discoveries and delaying the optimization of materials properties. A truly efficient research paradigm necessitates dynamic workflows capable of autonomously adjusting to incoming information, allowing for real-time refinement of simulations and accelerating the pace of materials innovation by embracing an adaptive, rather than a static, methodology.

AtomisticSkills establishes a hierarchical research infrastructure (comprising a skill library, research standards, and the Model Context Protocol toolbox) that enables general-purpose coding agents to autonomously execute complex atomistic research workflows under human supervision.

Agentic Systems: A Pathway to Accelerated Discovery

AtomisticSkills is an open-source infrastructure designed to automate and accelerate complex atomistic simulations. The system utilizes an agentic approach, employing software agents to perform tasks typically requiring manual scripting and execution by researchers. This framework allows for the autonomous execution of simulation workflows, from initial setup and parameterization to analysis and result extraction. By providing a flexible and extensible platform, AtomisticSkills aims to reduce the time and effort associated with computationally intensive research in materials science, chemistry, and related fields. The codebase and associated tools are publicly available to facilitate community contributions and customization.

AtomisticSkills enhances the capabilities of general-purpose coding agents by integrating domain-specific knowledge from computational materials science, chemistry, and drug discovery. This is achieved through the implementation of specialized tools and workflows that guide the agents in performing complex atomistic simulations. By layering this knowledge, the framework moves beyond basic code execution, enabling agents to autonomously handle tasks such as simulation setup, data analysis, and result interpretation. This approach facilitates adaptability; the system can be readily extended to incorporate new skills and methodologies without requiring extensive reprogramming of the underlying agents, creating a research environment that evolves with the field.

Agentic Atomistic Research enables researchers to specify desired outcomes at a high level, abstracting away the need for detailed scripting of individual simulation steps. The system then autonomously manages the complexities of the workflow, including task decomposition, code execution, data analysis, and iterative refinement. This approach allows for automated execution of simulations, handling dependencies between steps and adapting to unexpected results without manual intervention. Researchers retain control by defining the overarching objective and relevant parameters, while the agentic framework handles the logistical and computational details, significantly accelerating the research process.
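The task decomposition described above can be illustrated with a minimal sketch. It assumes each skill declares what it needs and what it produces, so that a high-level goal can be resolved into an ordered chain of skills. The names `Skill`, `SkillLibrary`, and the three toy skills are hypothetical and are not taken from the AtomisticSkills codebase; the "energy" is a placeholder number, not a real calculation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass(frozen=True)
class Skill:
    """Hypothetical skill record: inputs it reads, output it writes."""
    name: str
    requires: Tuple[str, ...]
    provides: str
    run: Callable[[dict], object]


class SkillLibrary:
    """Toy registry that resolves a goal into an ordered list of skills."""

    def __init__(self) -> None:
        self._by_output: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._by_output[skill.provides] = skill

    def plan(self, goal: str, available: Set[str]) -> List[Skill]:
        # Depth-first resolution of dependencies; assumes no cycles.
        ordered: List[Skill] = []
        planned: Set[str] = set(available)

        def visit(key: str) -> None:
            if key in planned:
                return
            skill = self._by_output[key]  # KeyError if nothing provides key
            for dep in skill.requires:
                visit(dep)
            ordered.append(skill)
            planned.add(key)

        visit(goal)
        return ordered

    def execute(self, goal: str, inputs: dict) -> dict:
        context = dict(inputs)
        for skill in self.plan(goal, set(context)):
            context[skill.provides] = skill.run(context)
        return context


library = SkillLibrary()
# Registration order is deliberately scrambled; the planner sorts it out.
library.register(Skill("single_point", ("relaxed",), "total_energy",
                       lambda ctx: -1.5 * ctx["relaxed"]["n_atoms"]))
library.register(Skill("relax", ("structure",), "relaxed",
                       lambda ctx: {**ctx["structure"], "relaxed": True}))
library.register(Skill("build_structure", ("formula",), "structure",
                       lambda ctx: {"formula": ctx["formula"], "n_atoms": 8}))

plan_names = [s.name for s in library.plan("total_energy", {"formula"})]
result = library.execute("total_energy", {"formula": "NaCl"})
```

The researcher only states the goal (`"total_energy"`) and the starting input; the ordering of intermediate steps is derived from the declared dependencies rather than scripted by hand.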

Analysis of 500 articles across three scientific disciplines – computational materials science, chemistry, and drug discovery – indicates the current framework’s ability to autonomously execute approximately 56.2% of the skills reported in materials science publications, 44.9% of those in chemistry articles, and 62.4% of skills utilized in drug discovery research. This coverage assessment is based on a systematic identification of distinct computational tasks and workflows described within the analyzed literature, quantifying the proportion of these tasks presently addressable by the agentic system. The data suggests a strong potential for automation in drug discovery, alongside notable coverage within the materials science domain, with continued development focused on expanding capabilities in chemistry.
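To make the two kinds of coverage figure concrete, the sketch below computes an average per-article skill coverage and the fraction of articles that are fully covered. This is one plausible reading of how such numbers could be derived; the exact protocol used in the paper is not described here, and the skill labels and toy articles are invented for illustration.

```python
def coverage_metrics(papers, available):
    """Return (mean per-paper skill coverage, fraction of fully covered papers).

    papers:    iterable of sets, each holding the skills one article uses
    available: set of skills the framework currently provides
    """
    fractions = []
    fully_covered = 0
    for skills in papers:
        needed = set(skills)
        if not needed:
            continue  # an article with no identified skills is skipped
        fraction = len(needed & available) / len(needed)
        fractions.append(fraction)
        if needed <= available:
            fully_covered += 1
    count = len(fractions)
    return sum(fractions) / count, fully_covered / count


# Invented toy data: three articles and a small skill library.
toy_papers = [
    {"dft", "md"},
    {"dft", "docking", "qm_mm", "free_energy"},
    {"md", "gcmc"},
]
toy_library = {"dft", "md", "docking"}

mean_coverage, full_fraction = coverage_metrics(toy_papers, toy_library)
```

In the toy data, one article is fully covered and two are half covered, so the average coverage is well above the fully covered fraction. The same gap appears in the reported figures: a single missing skill is enough to keep an article out of the fully covered group.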

Analysis of 500 computational drug discovery articles reveals that AtomisticSkills cover, on average, 62.4% of utilized skills and fully address approximately 14% of the papers, demonstrating substantial coverage of commonly employed scientific techniques.

Orchestrating Simulations: Intelligent Agents at Work

The simulation framework employs Large Language Model (LLM)-Driven Orchestrators and Knowledge Retrieval Agents to automate the management of complex in-silico simulation pipelines. These agents function by interpreting research objectives, identifying appropriate simulation methodologies, and constructing workflows from available computational tools. Knowledge Retrieval Agents access and process relevant scientific literature and databases to inform simulation parameterization and validation. The LLM-Driven Orchestrators then dynamically schedule and execute simulations, monitor progress, and adapt the workflow based on intermediate results, effectively automating tasks previously requiring significant manual intervention from researchers.

The system’s intelligent agents operate with autonomy by executing defined simulation protocols, processing resulting data using embedded analytical tools, and dynamically adjusting the simulation workflow. This adaptation is achieved through algorithms that evaluate simulation outcomes against pre-defined research goals and iteratively modify parameters such as material compositions, simulation durations, or applied forces. The agents utilize the framework’s integrated high-throughput capabilities to launch new simulations based on these adjusted parameters, creating a closed-loop process for efficient exploration of the simulation space and accelerated discovery. This autonomous operation minimizes the need for manual intervention, enabling large-scale simulations and rapid iteration on research hypotheses.
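The closed loop described above (run, evaluate against a goal, adjust parameters, run again) can be sketched in a few lines. The example assumes a single tunable parameter and uses a simple step-halving search; `run_simulation` is a stand-in toy function, not a real simulation, and nothing here reflects the specific algorithms used by the framework.

```python
def run_simulation(x):
    """Stand-in for an expensive simulation: toy property that peaks at x = 0.3."""
    return 1.0 - (x - 0.3) ** 2


def closed_loop(target, x0=0.9, step=0.2, max_iters=50):
    """Adjust x until the simulated property reaches `target` or iterations run out."""
    x, best = x0, run_simulation(x0)
    history = [(x, best)]
    for _ in range(max_iters):
        if best >= target:
            break  # research goal met, stop launching simulations
        # Propose neighbouring parameter values, clipped to the range [0, 1].
        candidates = [min(1.0, x + step), max(0.0, x - step)]
        value, candidate = max((run_simulation(c), c) for c in candidates)
        if value > best:
            x, best = candidate, value  # accept the improvement
        else:
            step /= 2  # no improvement: refine the search locally
        history.append((x, best))
    return x, best, history


x_final, property_final, trace = closed_loop(target=0.999)
```

A production loop would replace the toy function with queued simulation jobs and would typically use a more sample-efficient strategy such as Bayesian optimization, but the control flow stays the same.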

The system incorporates high-throughput workflow frameworks to facilitate the automated execution of numerous simulations, thereby enabling large-scale materials screening and data generation. These frameworks manage the sequential and parallel execution of computational tasks, including input generation, simulation execution using codes like VASP or LAMMPS, and post-processing of results. This automation significantly reduces manual intervention and accelerates the discovery of materials with desired properties. Data generated through these workflows is standardized and readily available for subsequent analysis, machine learning model training, and validation of theoretical predictions.
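As a rough illustration of the three stages named above (input generation, execution, post-processing), the sketch below runs a per-candidate pipeline in parallel with Python's standard `concurrent.futures` module. The stage functions are hypothetical placeholders: `run_code` returns a toy energy and does not call VASP, LAMMPS, or any real simulation code.

```python
from concurrent.futures import ThreadPoolExecutor


def make_input(candidate):
    """Stage 1: build an input record from a (name, lattice constant) pair."""
    name, lattice_constant = candidate
    return {"id": name, "lattice_constant": lattice_constant}


def run_code(sim_input):
    """Stage 2: placeholder for launching a simulation code (toy energy model)."""
    a = sim_input["lattice_constant"]
    return {"id": sim_input["id"], "raw_energy": -1.0 + (a - 4.0) ** 2}


def postprocess(raw):
    """Stage 3: emit a standardized record with explicit units."""
    return {"id": raw["id"], "energy_eV": round(raw["raw_energy"], 6)}


def pipeline(candidate):
    return postprocess(run_code(make_input(candidate)))


def screen(candidates, max_workers=4):
    """Run the pipeline for all candidates in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(pipeline, candidates))


records = screen([("A", 3.8), ("B", 4.0), ("C", 4.3)])
lowest = min(records, key=lambda r: r["energy_eV"])
```

Because every record has the same keys and units, the results can be fed directly into later analysis or model training, which is the standardization benefit the paragraph describes.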

Simulation efficiency and accuracy are improved through Machine Learning Interatomic Potentials (MLIPs) and Automated Machine Learning (AutoML). Separately, the coverage analysis indicates that the AtomisticSkills framework can currently fully address approximately 15% of published articles in computational materials science, 8% in computational chemistry, and 14% in computational drug discovery. Full coverage here means the framework can autonomously execute an article's simulations and analysis without manual intervention or modification of the workflow. This is a stricter measure than the average per-article skill coverage, which is why the percentages are lower, but it still indicates a capacity to automate a meaningful share of current research in these fields.

An agentic workflow efficiently screens materials for [latex]CO_2[/latex] capture and develops machine learning interatomic potentials (MLIPs) by autonomously composing computational tasks, including materials database queries, grand canonical Monte Carlo (GCMC) simulations, density functional theory (DFT) calculations, and model benchmarking, as demonstrated by identifying promising MOF candidates and fine-tuning the MACE-OMAT-0-small potential on the [latex]Li_{10}GeP_2S_{12}[/latex] dataset.

The Foundation of Extensibility: Harnessing Domain Knowledge

The system’s architecture fundamentally relies on ‘Externalization’, a design principle that separates core reasoning abilities from specialized knowledge. Rather than embedding expertise directly into the agent, the framework utilizes broadly capable coding agents as a foundational layer – a versatile substrate for intelligence. Domain-specific skills aren’t built in, but rather added as modular extensions, akin to plugins or add-ons. This approach fosters remarkable adaptability; new capabilities, whether advancements in scientific methodology or specific data analysis techniques, can be integrated without requiring extensive modification of the core system. The result is a highly extensible framework poised to incorporate evolving research needs and maintain long-term relevance, effectively decoupling intelligence from the limitations of pre-programmed expertise.

The framework’s design prioritizes seamless scalability through modularity, enabling the incorporation of novel skills and techniques without requiring substantial architectural overhauls. This adaptability stems from a layered approach, where core functionalities are distinct from domain-specific extensions; as research progresses and new methodologies emerge, these extensions can be readily integrated, updated, or replaced. Consequently, the system avoids the rigidity often found in monolithic architectures, instead fostering a dynamic environment where the framework’s capabilities evolve in concert with the demands of ongoing investigations. This ensures long-term relevance and reduces the risk of obsolescence, positioning the system as a robust platform for sustained scientific exploration.

The architecture fundamentally relies on a distinction between ‘Inner’ and ‘Outer’ Harnesses to facilitate robust intelligent agent operation. The ‘Inner Harness’ represents the core computational machinery – the foundational algorithms for perception, planning, and action – providing a generalized framework for problem-solving. Complementing this is the ‘Outer Harness’, which encapsulates domain-specific procedural knowledge; this isn’t about what the agent knows, but how it applies its core capabilities to a specific task or environment. This separation allows for modularity; the ‘Inner Harness’ remains constant while ‘Outer Harnesses’ can be swapped or modified to adapt the agent to new challenges without requiring a complete overhaul of the underlying system. Consequently, the agent’s intelligence isn’t monolithic, but rather a combination of general competence and specialized expertise, enabling flexible and efficient performance across a range of applications.
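The separation between the two harnesses can be sketched as follows, assuming the outer harness is simply an object that turns a task into an ordered procedure while the inner harness is a fixed execution loop. All class, method, and tool names here are hypothetical illustrations of the idea and do not describe the actual AtomisticSkills implementation.

```python
from typing import Callable, Dict, List, Protocol


class OuterHarness(Protocol):
    """Domain-specific procedural knowledge: how to approach a task."""
    name: str

    def procedure(self, task: str) -> List[str]: ...


class InnerHarness:
    """General-purpose execution loop; unchanged across domains."""

    def __init__(self, tools: Dict[str, Callable[[str], str]]) -> None:
        self.tools = tools

    def solve(self, task: str, outer: OuterHarness) -> List[str]:
        # The inner harness only knows how to run steps, not which ones to pick.
        return [self.tools[step](task) for step in outer.procedure(task)]


class MaterialsHarness:
    name = "materials"

    def procedure(self, task: str) -> List[str]:
        return ["query_database", "run_dft"]


class DrugDiscoveryHarness:
    name = "drug_discovery"

    def procedure(self, task: str) -> List[str]:
        return ["prepare_ligand", "dock"]


def make_tool(label: str) -> Callable[[str], str]:
    """Build a placeholder tool that just records what it was asked to do."""
    return lambda task: f"{label}({task})"


tool_names = ["query_database", "run_dft", "prepare_ligand", "dock"]
agent = InnerHarness({name: make_tool(name) for name in tool_names})

materials_log = agent.solve("screen MOFs", MaterialsHarness())
drug_log = agent.solve("screen ligands", DrugDiscoveryHarness())
```

The same `agent` object serves both domains; only the outer harness passed to `solve` changes, which mirrors the swap-without-overhaul property described above.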

The system’s reasoning prowess is significantly amplified through the implementation of retrieval-augmented generation (RAG) techniques. Rather than relying solely on its pre-trained knowledge, the framework actively accesses and incorporates information from external knowledge sources during the generation process. This dynamic integration allows the system to ground its responses in current, relevant data, improving accuracy and reducing the potential for hallucination. By retrieving pertinent information – be it scientific literature, databases, or real-time data – and seamlessly weaving it into its outputs, the system demonstrates a sophisticated ability to reason with, and synthesize, a far broader knowledge base than would otherwise be possible. This approach not only enhances the reliability of its conclusions but also fosters adaptability, allowing it to address novel challenges and incorporate emerging research findings with greater efficiency.
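The retrieve-then-generate pattern can be shown with a deliberately small sketch. It assumes a bag-of-words cosine similarity for retrieval, which is a simplification chosen for self-containment (real systems generally use learned embeddings), and it stops at prompt assembly rather than calling a language model. The documents and function names are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Ground the question in retrieved context before generation."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


knowledge_base = [
    "GCMC simulations estimate CO2 uptake in metal-organic frameworks.",
    "Docking predicts ligand binding poses in protein pockets.",
    "DFT calculations give adsorption energies on oxide surfaces.",
]
question = "How do I estimate CO2 uptake in a MOF with GCMC?"
top_hit = retrieve(question, knowledge_base, k=1)[0]
prompt = build_prompt(question, knowledge_base, k=1)
```

The assembled prompt carries the retrieved passage alongside the question, so the generated answer can be checked against a stated source instead of relying only on what the model memorized during training.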

An agentic workflow successfully analyzes experimental XRD data to identify material compositions, recommend synthesis recipes, and screen Fe-oxide surfaces for oxygen evolution reaction (OER) activity, predicting OER overpotential based on the binding energy descriptor [latex]\Delta G_{\rm O^*}-\Delta G_{\rm OH^*}[/latex] and aligning with established theoretical limits.
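For readers unfamiliar with the descriptor in the caption, the sketch below shows how an overpotential is commonly estimated from it in the OER literature. It assumes the widely used scaling relation in which the OOH* and OH* binding free energies differ by roughly 3.2 eV, and it assumes one of the two middle reaction steps is potential-determining. Whether the workflow in the paper uses exactly this model is not stated here, so treat the constants and the function name as assumptions.

```python
STANDARD_POTENTIAL_V = 1.23  # equilibrium potential for water oxidation
OOH_OH_SCALING_EV = 3.2      # assumed scaling: dG(OOH*) - dG(OH*), approximate


def oer_overpotential(descriptor_ev):
    """Estimate the OER overpotential (V) from dG(O*) - dG(OH*) in eV.

    Assumes the potential-determining step is either OH* -> O* or O* -> OOH*,
    whose free energies sum to the scaling constant.
    """
    step_form_o = descriptor_ev                        # OH* -> O*
    step_form_ooh = OOH_OH_SCALING_EV - descriptor_ev  # O* -> OOH*
    # One electron per step, so eV per step maps directly onto volts.
    return max(step_form_o, step_form_ooh) - STANDARD_POTENTIAL_V


balanced = oer_overpotential(1.6)        # both steps equally demanding
weak_oxygen = oer_overpotential(2.0)     # forming O* is the bottleneck
strong_oxygen = oer_overpotential(1.0)   # forming OOH* is the bottleneck
```

Under these assumptions the estimate is smallest when the descriptor sits at half the scaling constant, which gives a floor of about 0.37 V. This is the kind of theoretical limit against which screened surfaces are usually compared.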

Towards Self-Driving Materials Discovery

The convergence of artificial intelligence and robotics is giving rise to Hardware-Integrated Autonomous Agents, systems poised to revolutionize materials discovery by establishing a closed loop between computational design and physical experimentation. These agents don’t merely propose materials; they actively direct robotic platforms to synthesize and characterize them, analyzing the results and iteratively refining their designs without human intervention. This continuous cycle of proposal, execution, and analysis dramatically accelerates the prototyping process, allowing for the rapid validation – or rejection – of hypotheses and the efficient exploration of vast chemical spaces. By automating the traditionally slow and laborious process of materials research, these systems promise to unlock new materials with tailored properties at an unprecedented rate, moving the field closer to self-driving discovery.

Recent advancements in materials discovery leverage agentic Integrated Development Environments (IDEs), notably Google Antigravity and OpenClaw, which serve as the interfaces through which agents use the AtomisticSkills framework. These environments are more than coding tools: they let autonomous agents directly drive research workflows. By integrating with AtomisticSkills, the library of modular research skills described earlier, agents can independently design computational experiments, analyze data, and refine their approach. Coupled to automated laboratory hardware, this integration would allow a closed-loop system in which computational planning is linked directly to physical action, accelerating materials innovation by automating repetitive tasks and enabling rapid, iterative testing of hypotheses. The result moves materials science closer to self-driving discovery, in which complex materials are designed and created with minimal human intervention.

The core strength of this integrated system lies in its ability to continuously refine materials through iterative experimentation. By seamlessly connecting computational design with physical synthesis and characterization, the closed-loop process allows for rapid assessment of material properties and subsequent adjustments to the design parameters. This isn’t simply about automating existing methods; it’s about establishing a cycle of learning where each experiment informs the next, progressively honing in on desired material characteristics. Consequently, the discovery of novel compounds with tailored properties is significantly accelerated, bypassing the traditionally slow and often serendipitous nature of materials research. The system effectively functions as a self-improving engine, optimizing material performance through a continuous feedback loop and enabling the exploration of vast compositional spaces previously considered intractable.

Materials science is poised for a revolution driven by fully autonomous research systems. These systems transcend traditional computational methods by integrating design, synthesis, and characterization into a closed loop, effectively creating a self-driving laboratory. Rather than relying on human-guided experimentation, these agents independently formulate hypotheses, direct robotic synthesis of candidate materials, and analyze the resulting structures and properties. This iterative process, powered by machine learning and advanced robotics, promises to dramatically accelerate the pace of materials discovery, potentially unlocking novel compounds with tailored properties for applications ranging from energy storage to advanced manufacturing. The vision extends beyond simply automating existing workflows; it anticipates systems capable of posing original research questions and creatively exploring the vast chemical space with minimal human intervention, fundamentally reshaping how materials are invented and optimized.

Screenshots showcase the application of AtomisticSkills within the Google Antigravity environment, demonstrating its functionality and integration.

The development of AtomisticSkills hinges on a systematic approach to scientific inquiry, mirroring the core tenets of rigorous experimentation. Each modular skill represents a discrete investigation, and their orchestration within an agentic workflow demands careful consideration of dependencies and feedback loops. As Wilhelm Röntgen observed, “I have made a discovery which will revolutionize medical diagnostics.” Similarly, this infrastructure aims to revolutionize materials discovery by automating complex workflows and fostering reproducibility. The power lies not merely in generating data, but in interpreting the structural dependencies revealed through these atomistic simulations, a principle echoing Röntgen’s pursuit of understanding hidden phenomena.

Where the Atoms Lead

The architecture detailed within anticipates, perhaps ironically, a future where the limitations aren’t computational power, but the articulation of scientific intuition. While current efforts focus on expanding the ‘skill’ library and refining agentic workflows, the true challenge lies in accommodating – even embracing – the unexpected. Every deviation from predicted outcomes, every outlier in a dataset, represents not a failure of the system, but an opportunity to uncover hidden dependencies previously obscured by theoretical bias. The current framework provides tools to manage error, but future iterations must be designed to actively seek it.

A critical path forward involves a more nuanced integration of uncertainty quantification. Atomistic simulations, by their very nature, operate within probabilistic landscapes. Simply achieving a ‘converged’ result is insufficient; the system must be capable of expressing the confidence in that result, and, crucially, identifying the sources of that uncertainty. This necessitates a shift from deterministic workflows to probabilistic ones, demanding new methods for reasoning under incomplete or contradictory evidence.
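One simple way to attach a confidence to a result, shown here only as an illustration of the idea, is to compare predictions from several independently trained models and treat their spread as the uncertainty. The function name, the threshold, and the numbers below are invented; the framework's own approach to uncertainty, if any, is not described in this article.

```python
from statistics import mean, stdev


def ensemble_estimate(predictions, max_std):
    """Summarize an ensemble: central value, spread, and a trust flag.

    predictions: values of one quantity from independently trained models
    max_std:     largest spread (same units) still considered trustworthy
    """
    if len(predictions) < 2:
        raise ValueError("need at least two predictions to estimate a spread")
    spread = stdev(predictions)  # sample standard deviation
    return {
        "mean": mean(predictions),
        "std": spread,
        "trusted": spread <= max_std,
    }


# Invented example energies (eV) from three hypothetical models.
agreeing = ensemble_estimate([-3.10, -3.12, -3.08], max_std=0.05)
disagreeing = ensemble_estimate([-3.1, -2.7, -3.5], max_std=0.05)
```

A workflow could route the second case to a higher-fidelity calculation instead of accepting the average, which turns disagreement between models into a signal about where to look next.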

Ultimately, the success of such endeavors will not be measured by the speed of discovery, but by the richness of the questions asked. The system presented here is not intended to replace the scientist, but to augment their capacity for imaginative exploration, providing a platform for testing hypotheses that would otherwise remain confined to the realm of thought experiments. The true frontier lies not in automating the known, but in systematically probing the unknown.


Original article: https://arxiv.org/pdf/2605.24002.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/

2026-05-26 08:21