An Efficient Reinforcement Learning Approach to Optimal Control with Application to Biodiesel Production
Authors: Shiam Kannan, Urmila Diwekar
Journal: Computers & Chemical Engineering, Volume 174, Article 108258
Publication date: 2023-06-01
DOI: 10.1016/j.compchemeng.2023.108258
URL: https://www.sciencedirect.com/science/article/pii/S009813542300128X
Optimal control problems are among the most challenging problems in optimization. This paper presents a new and efficient Reinforcement Learning (RL) approach to optimal control problems based on the Batch Q-learning algorithm. To improve the convergence of the RL algorithm, we exploit the k-dimensional uniformity of an advanced sampling procedure, namely Hammersley sequence sampling (HSS). HSS is used to sample the state variables and the discrete controls from the action space for the RL optimal control problem. The Neural-fitted Q-iterative algorithm is applied to solve an optimal control problem for a first-order state dynamical system. A real-world application, determining the optimal temperature profile for biodiesel production in a batch reactor, is also presented. We compare the results of our HSS-RL algorithm with those obtained using the maximum principle.
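To illustrate the sampling idea mentioned in the abstract, the following is a minimal sketch of a Hammersley point set and of one way such points could be mapped to state and discrete-control samples. It is not the authors' implementation: the function names, the choice of prime bases, the scalar state range, and the action list are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Prime bases for the radical-inverse coordinates (illustrative choice).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]


def radical_inverse(i: int, base: int) -> float:
    """Mirror the base-`base` digits of i about the radix point."""
    result, frac = 0.0, 1.0 / base
    while i > 0:
        result += (i % base) * frac
        i //= base
        frac /= base
    return result


def hammersley(n: int, k: int) -> List[List[float]]:
    """Return n points of the k-dimensional Hammersley set in [0, 1)^k.

    The first coordinate is i / n; the remaining k - 1 coordinates are
    radical inverses of i in the first k - 1 prime bases.
    """
    if k < 1 or k - 1 > len(PRIMES):
        raise ValueError("unsupported dimension")
    return [
        [i / n] + [radical_inverse(i, PRIMES[d]) for d in range(k - 1)]
        for i in range(n)
    ]


def sample_state_action(
    n: int, state_low: float, state_high: float, actions: Sequence[float]
) -> List[Tuple[float, float]]:
    """Map 2-D Hammersley points to (state, discrete action) pairs.

    Hypothetical helper: assumes a scalar state on [state_low, state_high)
    and a finite list of candidate control values.
    """
    pairs = []
    for u_state, u_action in hammersley(n, 2):
        state = state_low + u_state * (state_high - state_low)
        idx = min(int(u_action * len(actions)), len(actions) - 1)
        pairs.append((state, actions[idx]))
    return pairs


if __name__ == "__main__":
    for state, action in sample_state_action(4, 0.0, 2.0, [-1.0, 1.0]):
        print(state, action)
```

Because the points come from a deterministic low-discrepancy construction, a small batch already covers the state range and the action set evenly, which is the uniformity property the abstract refers to. A batch generated this way could then serve as the training set for a batch Q-learning update.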
About the journal:
Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.