Offline constrained reinforcement learning for batch-to-batch optimization of cobalt oxalate synthesis process
Journal: Chemical Engineering Research & Design (impact factor 3.7; JCR Q2, Engineering, Chemical)
Published: 2024-08-13
DOI: 10.1016/j.cherd.2024.08.013
URL: https://www.sciencedirect.com/science/article/pii/S026387622400491X
Cobalt oxalate synthesis, a batch process, plays a crucial role in refining cobalt metal. The mean particle size of cobalt oxalate is a critical indicator of product quality. However, an excessive flow of ammonium oxalate solution raises waste disposal costs in production. To address both concerns, we propose a novel offline reinforcement learning (RL) algorithm that guarantees constraint compliance in the cobalt oxalate synthesis process while learning exclusively from static datasets. The method uses cost critic networks to estimate costs and introduces Lagrangian multipliers to transform the constrained optimization problem into an unconstrained one. The multipliers are updated by proportional-integral-derivative (PID) control, and an exponential moving average (EMA) is applied to that update to reduce overshoot and oscillation and thus improve the overall stability of the system. Furthermore, to improve algorithm performance, a deep residual network (DResNet) is integrated into the policy network. Experimental results indicate that the policy learned by the algorithm performs significantly better under constraints.
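The multiplier update described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no equations, so the class name, the gains (kp, ki, kd), the smoothing factor ema_alpha, and the choice to apply the EMA to the constraint-violation signal are all assumptions made for illustration. The sketch shows a PID-controlled Lagrange multiplier that grows when the estimated cost exceeds its limit, and the resulting unconstrained objective (reward minus multiplier-weighted cost) that a policy would maximize.

```python
from dataclasses import dataclass


@dataclass
class PIDLagrangeMultiplier:
    """Illustrative PID-controlled Lagrange multiplier with EMA smoothing.

    All hyperparameters here are hypothetical; the source does not state them.
    """

    cost_limit: float          # constraint threshold on the expected cost
    kp: float = 0.1            # proportional gain (assumed)
    ki: float = 0.01           # integral gain (assumed)
    kd: float = 0.05           # derivative gain (assumed)
    ema_alpha: float = 0.9     # EMA smoothing factor (assumed)
    integral: float = 0.0
    smoothed_error: float = 0.0
    value: float = 0.0         # current multiplier, kept non-negative

    def update(self, estimated_cost: float) -> float:
        # Constraint violation: positive when the cost critic's estimate
        # exceeds the limit.
        error = estimated_cost - self.cost_limit

        # EMA of the violation signal damps batch-to-batch noise.
        previous = self.smoothed_error
        self.smoothed_error = (
            self.ema_alpha * previous + (1.0 - self.ema_alpha) * error
        )

        # Integral term is clipped at zero so it cannot wind up negatively.
        self.integral = max(0.0, self.integral + self.smoothed_error)

        # Derivative term reacts only to a rising violation.
        derivative = max(0.0, self.smoothed_error - previous)

        raw = (
            self.kp * self.smoothed_error
            + self.ki * self.integral
            + self.kd * derivative
        )
        # A Lagrange multiplier for an inequality constraint is non-negative.
        self.value = max(0.0, raw)
        return self.value


def lagrangian_objective(reward_q: float, cost_q: float, multiplier: float) -> float:
    """Unconstrained surrogate: reward value minus multiplier-weighted cost value."""
    return reward_q - multiplier * cost_q


if __name__ == "__main__":
    lam = PIDLagrangeMultiplier(cost_limit=1.0)
    for cost in [2.0, 1.8, 1.4, 1.0, 0.8]:
        print(f"cost={cost:.1f} multiplier={lam.update(cost):.4f}")
```

In an actual offline RL loop, estimated_cost would come from the cost critic networks evaluated on the static dataset, and the multiplier would weight the cost critic in the policy loss; those pieces are omitted here.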
About the journal:
ChERD aims to be the principal international journal for publication of high quality, original papers in chemical engineering.
Papers showing how research results can be used in chemical engineering design, and accounts of experimental or theoretical research work bringing new perspectives to established principles, highlighting unsolved problems or indicating directions for future research, are particularly welcome. Contributions that deal with new developments in plant or processes and that can be given quantitative expression are encouraged. The journal is especially interested in papers that extend the boundaries of traditional chemical engineering.