From Sketch to Simulation: AI Automates Chemical Process Design

Author: Denis Avetisyan


Researchers have developed an artificial intelligence system that automatically translates process flow diagrams into executable simulations, streamlining the design of complex chemical plants.

A multi-agent system architecture provides a framework for coordinating the actions of independent agents, acknowledging that even the most elegantly designed system will ultimately confront the unpredictable realities of a production environment.

A multi-agent system leveraging large language models automates flowsheet generation within Aspen HYSYS for diverse process simulations.

Converting hand-drawn process diagrams into executable simulations remains a significant bottleneck in chemical engineering, demanding considerable manual effort and specialized software expertise. This work presents ‘Sketch2Simulation: Automating Flowsheet Generation via Multi Agent Large Language Models’, a novel multi-agent system that directly translates process flow diagrams into executable models within Aspen HYSYS. The framework achieves this by decomposing the task into diagram interpretation, model synthesis, and multi-level validation, demonstrating successful model generation across case studies of increasing complexity. Can this approach unlock fully automated process design and optimization workflows, and what further innovations are needed to address the challenges posed by complex, implicit diagram semantics and simulator constraints?


From Diagrams to Digital Twins: The Inevitable Automation

The creation of accurate process simulations, historically reliant on software like Aspen HYSYS, has long been characterized by intensive manual effort. Engineers meticulously build process flow diagrams and then painstakingly translate these into digital representations within the simulation environment, a process demanding significant time and expertise. This reliance on manual input introduces a substantial risk of human error, from incorrect equipment specifications to flawed connections between unit operations. Consequently, even minor mistakes can propagate through the simulation, leading to inaccurate results and potentially costly design flaws. The inherently time-consuming nature of this traditional workflow also limits the number of design iterations that can be explored, hindering optimization and delaying the scale-up of chemical processes.

The reliance on manual construction of process simulations introduces a critical impediment to efficient chemical engineering workflows. Each step – from translating process flow diagrams into software inputs to validating model accuracy – demands considerable time and specialized expertise, creating a bottleneck that slows down design iterations and hinders optimization efforts. This manual process isn’t merely time-consuming; it’s also susceptible to human error, potentially leading to costly mistakes during scale-up and impacting overall process performance. Consequently, the speed at which innovations can be brought to fruition, and the efficiency with which existing processes can be improved, are both significantly constrained by this inherent limitation in traditional methods.

Contemporary chemical processes are no longer simple, linear systems; instead, they represent intricate networks of reactions, separations, and control loops, demanding increasingly sophisticated modeling approaches. This surge in complexity, coupled with intense market pressures for accelerated innovation and reduced time-to-market, fuels the critical need for automation in process simulation. The traditional, largely manual creation of process models, requiring significant engineering time and expertise, simply cannot keep pace with modern demands. Consequently, automated workflows, capable of rapidly generating and optimizing digital twins of chemical processes, are becoming essential for maintaining competitiveness and effectively responding to evolving industry challenges. The ability to swiftly iterate through design options and rapidly scale up production relies heavily on minimizing manual intervention and maximizing the speed and accuracy of simulation.

The HYSYS flowsheet illustrates the process configuration for aromatic compound production.

Decomposing Complexity: A Multi-Agent Approach

A Multi-Agent System (MAS) is employed to address the complexity inherent in automated model generation by dividing the overall task into discrete sub-problems handled by individual, specialized agents. Each agent is designed with a specific competency, such as unit operation modeling, stream property estimation, or equation solving, and operates autonomously while coordinating with other agents through a defined communication protocol. This decomposition enables parallel processing and facilitates the management of intricate process models that would be difficult to handle as a single, monolithic system. The MAS architecture promotes modularity, allowing for the easy addition, removal, or modification of agents to adapt to varying process requirements and modeling needs.
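The decomposition described above can be sketched as a coordinator dispatching sub-tasks to specialized agents that share a common state. This is a minimal illustration only; the agent names and the shared-dictionary protocol are assumptions, not the paper's actual interfaces, and the placeholder bodies stand in for LLM calls:

```python
from abc import ABC, abstractmethod

class Agent(ABC):
    """Base class: each agent owns one sub-problem and exposes run()."""
    @abstractmethod
    def run(self, state: dict) -> dict: ...

class ParsingAgent(Agent):
    def run(self, state):
        # Placeholder: a real agent would interpret the process diagram.
        state["units"] = ["FEED-HEATER", "REACTOR", "SEPARATOR"]
        state["streams"] = [("FEED-HEATER", "REACTOR"), ("REACTOR", "SEPARATOR")]
        return state

class SynthesisAgent(Agent):
    def run(self, state):
        # Placeholder: a real agent would instantiate simulator objects.
        state["model"] = {u: {"instantiated": True} for u in state["units"]}
        return state

class Coordinator:
    """Runs agents in sequence, passing a shared state dict between them."""
    def __init__(self, agents):
        self.agents = agents

    def run(self, state=None):
        state = state or {}
        for agent in self.agents:
            state = agent.run(state)
        return state

result = Coordinator([ParsingAgent(), SynthesisAgent()]).run()
print(sorted(result["model"]))  # ['FEED-HEATER', 'REACTOR', 'SEPARATOR']
```

Because each agent only reads and writes the shared state, an agent can be swapped out or a new one appended without touching the others, which is the modularity property the paragraph describes.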

The Diagram Parsing and Interpretation Layer functions as the initial processing stage for input Process Diagrams. This layer employs algorithms to analyze the visual elements of the diagram, identifying and extracting key components such as Unit Operations and Stream Connections. The extracted information is then translated into a structured Intermediate Representation, a standardized data format designed for internal system processing. This representation defines the process topology by explicitly detailing the relationships between unit operations and the data flow established by stream connections, enabling subsequent agents to operate on a consistent and machine-readable format.

The Intermediate Representation explicitly defines process topology by detailing Unit Operations – discrete processing steps within the system – and the Stream Connections that govern material or information flow between them. This structured data format doesn’t merely list components; it establishes the precise relationships defining how these operations are linked. Specifically, it captures which output streams from one Unit Operation serve as input streams for others, creating a network representation of the entire process. This relational data is critical for subsequent automated model generation, enabling the system to understand process flow and dependencies without relying on the original diagram’s visual layout.
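One plausible shape for such an intermediate representation is a small set of typed records plus a topology query, shown below with illustrative field names (the paper's actual schema is not reproduced here):

```python
from dataclasses import dataclass, field

@dataclass
class UnitOperation:
    name: str          # e.g. "E-100"
    kind: str          # e.g. "Heater", "Separator"
    params: dict = field(default_factory=dict)

@dataclass
class StreamConnection:
    name: str
    source: str        # upstream unit name (None for feed streams)
    target: str        # downstream unit name (None for product streams)

@dataclass
class Flowsheet:
    units: list
    streams: list

    def downstream_of(self, unit_name: str) -> list:
        """Units fed by streams leaving `unit_name` -- the relational query
        later agents need, independent of the diagram's visual layout."""
        outgoing = {s.target for s in self.streams if s.source == unit_name}
        return sorted(outgoing - {None})

fs = Flowsheet(
    units=[UnitOperation("E-100", "Heater"), UnitOperation("V-100", "Separator")],
    streams=[StreamConnection("S1", None, "E-100"),
             StreamConnection("S2", "E-100", "V-100")],
)
print(fs.downstream_of("E-100"))  # ['V-100']
```

Note that the representation carries no coordinates or drawing information; once connectivity is captured explicitly, the original diagram's layout becomes irrelevant to downstream agents.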

The system’s modular architecture facilitates both flexibility and scalability in automated model generation. Individual agents, responsible for specific tasks within the process – such as diagram parsing, unit operation selection, or stream connection validation – can be independently modified or replaced without affecting the overall system functionality. This decoupling allows the system to adapt to diverse process diagram formats and varying levels of complexity, ranging from simple, linear processes to highly intricate, multi-faceted designs. Furthermore, the agent-based design supports scalability by enabling the addition of new agents to handle increased computational demands or to incorporate support for additional process components, ensuring the system can accommodate increasingly complex modeling requirements without significant performance degradation.

The generated flowsheet illustrates a HYSYS simulation of the Merox process, depicting the unit operations and stream connections for sulfur removal.

From Representation to Reality: Synthesizing the Simulation

The Simulation Model Synthesis Layer functions as the core component responsible for translating the abstract Intermediate Representation into a concrete, executable process flowsheet within the Aspen HYSYS simulation environment. This layer doesn’t merely copy data; it actively instantiates process units and stream connections as defined in the Intermediate Representation, effectively building the simulation model programmatically. This instantiation process involves creating instances of predefined HYSYS components, configuring their parameters based on the Intermediate Representation’s specifications, and establishing the necessary material and energy flow connections between these components. The layer serves as the bridge between the high-level process description and the simulator’s native format, enabling automated model generation.

The simulation model is constructed through the coordinated operation of three specialized agents: the Basis Agent, the Instantiation Agent, and the Configuration Agent. The Basis Agent initializes the model by defining fundamental parameters and unit operations based on the Intermediate Representation. Following initialization, the Instantiation Agent creates concrete instances of these units and streams within the Aspen HYSYS environment. Finally, the Configuration Agent establishes the necessary connections between units, sets operating parameters, and configures component properties to fully define the process flowsheet. This collaborative approach ensures a systematic and automated translation of the process design into a functional simulation model.

The Normalization Agent functions as a critical intermediary, verifying and adjusting the Intermediate Representation to meet the specific structural requirements of the target simulator, Aspen HYSYS. This process involves validating component lists, ensuring unit operation definitions align with HYSYS capabilities, and resolving any discrepancies in stream definitions or equipment naming conventions. Specifically, the agent standardizes data types, verifies dimensional consistency, and transforms abstract representations into concrete HYSYS objects. Successful normalization is crucial for preventing model instantiation errors and maintaining data integrity throughout the automated simulation workflow.
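The kind of checks the Normalization Agent performs can be illustrated with a small validator. The supported-kind list and naming rule below are assumptions for the sketch, not HYSYS's actual constraints:

```python
# Hypothetical subset of unit-operation kinds the target simulator accepts.
SUPPORTED_KINDS = {"Heater", "Cooler", "Separator", "Mixer", "Pump"}

def normalize_unit(unit: dict) -> dict:
    """Validate a unit record from the IR and coerce it toward simulator
    conventions, raising on anything the simulator could not instantiate."""
    kind = unit["kind"].strip().title()      # "heater " -> "Heater"
    if kind not in SUPPORTED_KINDS:
        raise ValueError(f"unsupported unit operation: {unit['kind']!r}")
    # Standardize equipment naming, e.g. "e 100" -> "E-100".
    name = unit["name"].strip().upper().replace(" ", "-")
    # Standardize data types: numeric parameters become floats.
    params = {k: float(v) for k, v in unit.get("params", {}).items()}
    return {"name": name, "kind": kind, "params": params}

print(normalize_unit({"name": "e 100", "kind": "heater ", "params": {"duty_kW": "250"}}))
```

Failing fast here, before instantiation, is the point: a rejected record produces a clear error at normalization time rather than an opaque model-building failure inside the simulator.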

Automated model synthesis consistently achieves high levels of structural accuracy, as demonstrated by performance metrics across multiple case studies. Specifically, Unit Consistency, which measures the correct instantiation of process units, attains an F1-score of ≥ 0.98. Stream Consistency, evaluating the proper connectivity and data association of material streams, reaches ≥ 0.96, even within the most complex aromatic production case study analyzed. These results indicate a high degree of reliability in translating the intermediate representation into a functional simulation environment, minimizing manual intervention and potential errors.

Connection Consistency, a key metric for evaluating the automated process, assesses the accurate linking of process units and material streams within the generated Aspen HYSYS simulation. In the most complex aromatic production case study, this system maintains Connection Consistency at a level of ≥ 0.93, as measured by F1-score. This indicates a high degree of reliability in establishing correct material and energy flow pathways between components of the process flowsheet, minimizing errors in model topology and ensuring valid simulation results. The achieved score demonstrates the system’s ability to consistently and accurately translate the Intermediate Representation into a fully connected and functional process simulation within the HYSYS environment.
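The F1-scores quoted above compare generated model elements against a reference flowsheet. A standard way to compute such a score, shown here for stream connections with made-up reference sets:

```python
def f1_score(predicted: set, reference: set) -> float:
    """Harmonic mean of precision and recall over two sets of elements."""
    if not predicted or not reference:
        return 0.0
    tp = len(predicted & reference)        # correctly generated elements
    precision = tp / len(predicted)        # generated elements that are right
    recall = tp / len(reference)           # reference elements that were found
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

reference = {("E-100", "V-100"), ("V-100", "T-100"), ("T-100", "P-100")}
predicted = {("E-100", "V-100"), ("V-100", "T-100")}   # one connection missed
print(round(f1_score(predicted, reference), 3))  # 0.8
```

The same function applies to the unit and stream metrics by swapping in sets of instantiated units or stream records in place of connection tuples.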

The Execution Agent constitutes the final stage of automated model generation, responsible for initiating the Aspen HYSYS simulation and managing potential runtime errors. Following model instantiation and configuration, this agent submits the completed flowsheet for execution, monitoring the simulation process for convergence failures or other exceptions. Error handling procedures within the agent include automated diagnostic reporting and, where possible, iterative adjustments to input parameters to facilitate successful model execution. Upon completion, or in the event of unrecoverable errors, the agent provides a status report, signaling the conclusion of the automated workflow and delivering the simulation results or error details to the user.
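The Execution Agent's run-monitor-retry behavior can be sketched as follows. The `solve` and `adjust` callbacks and the retry policy are illustrative stand-ins for the actual HYSYS solver invocation and diagnostic logic:

```python
class ConvergenceError(Exception):
    """Raised by the (stubbed) solver when the flowsheet fails to converge."""

def execute_with_retries(solve, adjust, max_attempts=3):
    """Run the simulation, adjusting inputs between failed attempts, and
    return a status report in either case."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            result = solve()
            return {"status": "converged", "attempts": attempt, "result": result}
        except ConvergenceError as err:
            last_error = str(err)
            adjust(attempt)  # e.g. relax tolerances or re-seed estimates
    return {"status": "failed", "attempts": max_attempts, "error": last_error}

# Stub solver that converges once the tolerance has been loosened.
state = {"tolerance": 1e-8}

def solve():
    if state["tolerance"] < 1e-6:
        raise ConvergenceError("tower T-100 did not converge")
    return {"duty_kW": 250.0}

def adjust(attempt):
    state["tolerance"] *= 100  # loosen tolerance after each failure

report = execute_with_retries(solve, adjust)
print(report["status"], report["attempts"])  # converged 2
```

Whether the run converges or exhausts its retries, the caller always receives a structured status report, mirroring the agent's role of delivering either results or error details to the user.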

A HYSYS flowsheet was generated to model the crude distillation process.

From Efficiency to Innovation: The Future of Process Simulation

The creation of accurate process simulations traditionally demands substantial time and effort, often requiring engineers to manually construct and verify complex models. Recent advancements, however, focus on automating this model generation process, dramatically reducing the workload and associated timelines. This automation isn’t simply about speed; it involves intelligent algorithms capable of translating process schematics and data directly into functional simulation models. The result is a streamlined workflow where engineers can rapidly iterate on designs and explore a wider range of operating conditions. Consequently, the time previously spent on tedious model building is now available for higher-level tasks such as process optimization, performance analysis, and innovative design exploration, ultimately accelerating the path from concept to implementation.

By freeing engineers from the constraints of repetitive, manual model building, advanced process simulation tools are fostering a new era of proactive problem-solving and inventive design. The shift allows dedicated professionals to concentrate analytical efforts on identifying areas for process improvement, exploring novel configurations, and ultimately, maximizing efficiency and output. This redirection of focus isn’t simply about saving time; it’s about unlocking human potential, enabling engineers to leverage their expertise in creative problem-solving rather than being burdened by the intricacies of data input and model validation. The result is a faster cycle of innovation, leading to more robust, adaptable, and high-performing processes across diverse industries.

The architecture of this process simulation leverages a multi-agent system designed with inherent modularity, allowing for seamless scalability as designs grow in complexity. Each agent represents a discrete unit or process step, and these agents interact according to predefined rules, effectively building a digital twin of the physical system. This approach differs significantly from monolithic simulation software; instead of rewriting or drastically altering the entire model to accommodate increased detail, additional agents can be readily integrated without disrupting existing functionality. Consequently, simulations can expand from representing a single process train to encompassing an entire facility, or even a network of interconnected plants, with minimal computational overhead and reduced development time. The system’s adaptability ensures that the simulation remains a viable tool for optimization and analysis, even as process designs become increasingly intricate and demanding.

The transition towards automated process simulation demonstrably reduces the incidence of human error, yielding simulations of markedly improved reliability and accuracy. Historically, constructing and validating these models demanded substantial manual input – a process inherently susceptible to inconsistencies and oversights. By minimizing direct human intervention in model creation and data entry, the system substantially lowers the potential for errors that could propagate through the entire simulation. This enhanced fidelity is not merely a matter of precision; it translates directly into more trustworthy results, enabling engineers to make data-driven decisions with greater confidence and ultimately optimize processes more effectively. The result is a shift from simulations burdened by potential inaccuracies to robust, dependable tools for predictive analysis and innovative design.

The pursuit of automated flowsheet generation, as demonstrated by Sketch2Simulation, feels less like innovation and more like accelerating the inevitable. The system successfully translates diagrams into executable models within Aspen HYSYS, a feat of engineering, certainly. However, one anticipates the emergent properties of these automatically generated simulations will reveal unforeseen interactions, edge cases, and ultimately, new avenues for process failure. As Henri PoincarĂ© observed, “It is through science that we arrive at truth, but it is through chaos that we get there.” This paper tackles the ‘science’ part, but the ‘chaos’, the production environment’s relentless capacity to expose the limits of any model, remains. The bug tracker will inevitably fill. They don’t deploy; they let go.

The Road Ahead

The automation of flowsheet generation, as demonstrated, merely shifts the burden of model inaccuracy. The system successfully translates diagrams into executable code, but the fidelity of that execution remains tethered to the assumptions embedded within both the diagram and the underlying solver. Expect a proliferation of increasingly sophisticated diagram validation techniques – a new layer of pre-processing to manage the inevitable divergence between representation and reality. The current work addresses model creation; the real challenge, predictably, will be model maintenance.

The multi-agent architecture, while elegant in concept, introduces its own complexities. Each agent represents a potential point of failure, a new surface for unexpected interactions. The field will likely witness a push towards simplification, a realization that more agents do not necessarily equate to more robust systems. The pursuit of ‘intelligent’ agents will invariably circle back to the need for rigorous, deterministic logic – the illusion of autonomy is expensive to sustain.

Ultimately, this work is another step in the ongoing effort to automate what was once considered expertise. The long-term outcome is not the elimination of chemical engineers, but a recalibration of their skillset. The focus will shift from manual model building to the curation of training data and the interpretation of simulation results – from crafting the map to navigating the territory. The problem isn’t a lack of tools; it’s a surplus of abstractions.


Original article: https://arxiv.org/pdf/2603.24629.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
