Deep Reinforcement Learning-Based Self-Optimization of Flow Chemistry

Impact Factor: 4.3 · Q2 (Engineering, Chemical)
Ashish Yewale, Yihui Yang, Neda Nazemifard, Charles D. Papageorgiou, Chris D. Rielly, and Brahim Benyahia*
DOI: 10.1021/acsengineeringau.5c00004
Journal: ACS Engineering Au, 5(3), 247–266
Published: May 13, 2025
Open-access PDF: https://pubs.acs.org/doi/epdf/10.1021/acsengineeringau.5c00004
Citations: 0

Abstract

The development of effective synthetic pathways is critical in many industrial sectors. The growing adoption of flow chemistry has opened new opportunities for more cost-effective and environmentally friendly manufacturing technologies. However, the development of effective flow chemistry processes is still hampered by labor- and experiment-intensive methodologies and poor or suboptimal performance. In this context, integrating advanced machine learning strategies into chemical process optimization can significantly reduce experimental burdens and enhance overall efficiency. This paper demonstrates the capabilities of deep reinforcement learning (DRL) as an effective self-optimization strategy for imine synthesis in flow; imines are key building blocks in many compounds, such as pharmaceuticals and heterocyclic products. A deep deterministic policy gradient (DDPG) agent was designed to iteratively interact with the environment, the flow reactor, and learn how to deliver optimal operating conditions. A mathematical model of the reactor was developed based on new experimental data to train the agent and evaluate alternative self-optimization strategies. To optimize the DDPG agent's training performance, different hyperparameter tuning methods were investigated and compared, including trial-and-error and Bayesian optimization. Most importantly, a novel adaptive dynamic hyperparameter tuning strategy was implemented to further enhance the training performance and optimization outcome of the agent. The performance of the proposed DRL strategy was compared against state-of-the-art gradient-free methods, namely SnobFit and Nelder–Mead. Finally, the outcomes of the different self-optimization strategies were tested experimentally. It was shown that the proposed DDPG agent delivers superior performance compared to its self-optimization counterparts. It offered better tracking of the global solution and reduced the number of required experiments by approximately 50% and 75% compared to Nelder–Mead and SnobFit, respectively. These findings hold significant promise for the chemical engineering community, offering a robust, efficient, and sustainable approach to optimizing flow chemistry processes and paving the way for broader integration of data-driven methods in process design and operation.
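The closed-loop idea described in the abstract — an agent proposes operating conditions, the (simulated) flow reactor returns a yield that serves as the reward, and the agent uses that feedback to refine its next proposal — can be sketched as follows. This is a minimal illustration only, not the authors' code: the reactor surrogate and every rate constant in it are hypothetical, and a simple random-search agent stands in for the DDPG actor–critic, which would require neural-network function approximators and a replay buffer.

```python
import math
import random

def reactor_yield(tau, temp_k):
    """Toy surrogate of a flow reactor (illustrative, not the paper's model).
    First-order conversion with an Arrhenius rate constant, minus a
    hypothetical side-reaction loss that grows with temperature and
    residence time, so pushing both to their maxima is penalized."""
    k = 1e6 * math.exp(-6000.0 / temp_k)        # main reaction rate, 1/min
    conversion = 1.0 - math.exp(-k * tau)       # plug-flow, first-order kinetics
    side_loss = 4e-4 * (temp_k - 298.0) * tau   # degradation penalty (made up)
    return conversion - side_loss

def self_optimize(n_experiments=60, seed=0):
    """Random-search stand-in for the DRL agent: propose conditions,
    observe the reward (yield), and keep the best operating point found."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_experiments):
        tau = rng.uniform(0.5, 10.0)       # residence time, min
        temp = rng.uniform(298.0, 373.0)   # temperature, K
        reward = reactor_yield(tau, temp)
        if best is None or reward > best[0]:
            best = (reward, tau, temp)
    return best
```

A real DDPG agent would replace the random proposals with a learned deterministic policy, but the experiment budget framing — each loop iteration corresponds to one flow experiment — is the same, which is why reducing iteration counts matters so much in this setting.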

About the Journal

ACS Engineering Au is an open access journal that reports significant advances in chemical engineering, applied chemistry, and energy, covering fundamentals, processes, and products. The journal's broad scope includes experimental, theoretical, mathematical, computational, chemical, and physical research from academic and industrial settings. Short letters, comprehensive articles, reviews, and perspectives are welcome on topics that include:

- Fundamental research in areas such as thermodynamics; transport phenomena (flow, mixing, mass and heat transfer); chemical reaction kinetics and engineering; catalysis; separations; interfacial phenomena; and materials
- Process design, development, and intensification (e.g., process technologies for chemicals and materials synthesis, design methods, process intensification, multiphase reactors, scale-up, systems analysis, process control, data correlation schemes, modeling, machine learning, artificial intelligence)
- Product research and development involving chemical and engineering aspects (e.g., catalysts, plastics, elastomers, fibers, adhesives, coatings, paper, membranes, lubricants, ceramics, aerosols, fluidic devices, intensified process equipment)
- Energy and fuels (e.g., pre-treatment, processing, and utilization of renewable energy resources; processing and utilization of fuels; properties and structure or molecular composition of both raw fuels and refined products; fuel cells, hydrogen, batteries; photochemical fuel and energy production; decarbonization; electrification; microwave; cavitation)
- Measurement techniques, computational models, and data on thermophysical, thermodynamic, and transport properties of materials and phase equilibrium behavior
- New methods, models, and tools (e.g., real-time data analytics, multiscale models, physics-informed machine learning models, machine-learning-enhanced physics-based models, soft sensors, high-performance computing)