Title: An Efficient Reinforcement Learning Approach to Optimal Control with Application to Biodiesel Production
Authors: Shiam Kannan, Urmila Diwekar
Journal: Computers & Chemical Engineering, Volume 174, Article 108258
Publication date: 2023-06-01
Publication type: Journal Article
DOI: 10.1016/j.compchemeng.2023.108258
URL: https://www.sciencedirect.com/science/article/pii/S009813542300128X
Impact factor: 3.9 (JCR Q2, Computer Science, Interdisciplinary Applications)
Citation count: 0
Abstract
Optimal control problems are among the most challenging problems in optimization. This paper presents a new and efficient Reinforcement Learning (RL) approach to optimal control problems based on the Batch Q-learning algorithm. To improve the convergence of the RL algorithm, we exploit the k-dimensional uniformity of an advanced sampling procedure, namely Hammersley sequence sampling (HSS). HSS is used to sample the state variables and the discrete controls from the action space for the RL optimal control problem. The Neural-fitted Q-iterative algorithm is applied to solve an optimal control problem for a first-order state dynamical system. A real-world application, the determination of the optimal temperature profile for biodiesel production in a batch reactor, is presented. We compare the results of our HSS-RL algorithm with those obtained using the maximum principle.
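The abstract does not show how the Hammersley points are generated, so the following is only a minimal sketch of the standard textbook construction, not the authors' implementation. It assumes a one-dimensional state and a small set of discrete controls; the function names, the state range, and the temperature values are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence, Tuple

# First few primes, used as bases for the radical-inverse coordinates.
PRIMES = (2, 3, 5, 7, 11, 13)


def radical_inverse(i: int, base: int) -> float:
    """Mirror the base-`base` digits of i about the radix point."""
    result, weight = 0.0, 1.0 / base
    while i > 0:
        result += (i % base) * weight
        i //= base
        weight /= base
    return result


def hammersley(n: int, dim: int) -> List[List[float]]:
    """Return n Hammersley points in the unit cube [0, 1)^dim.

    The first coordinate is i / n; the remaining coordinates are
    radical inverses of i in successive prime bases.
    """
    if dim < 1 or dim - 1 > len(PRIMES):
        raise ValueError("unsupported dimension for this sketch")
    return [
        [i / n] + [radical_inverse(i, PRIMES[d]) for d in range(dim - 1)]
        for i in range(n)
    ]


def sample_state_action(
    n: int, state_low: float, state_high: float, actions: Sequence[float]
) -> List[Tuple[float, float]]:
    """Map 2-D Hammersley points to (state, discrete action) pairs.

    Assumption: one state variable scaled to [state_low, state_high)
    and one control chosen from a finite list of actions.
    """
    samples = []
    for u_state, u_action in hammersley(n, 2):
        state = state_low + u_state * (state_high - state_low)
        index = min(int(u_action * len(actions)), len(actions) - 1)
        samples.append((state, actions[index]))
    return samples


if __name__ == "__main__":
    # Hypothetical values: a state in [0, 10) and three candidate temperatures.
    for pair in sample_state_action(4, 0.0, 10.0, [300.0, 320.0, 340.0]):
        print(pair)
```

Because the points are deterministic and spread evenly over the unit cube, a batch of state-action pairs built this way covers the sampling space more uniformly than the same number of pseudo-random draws, which is the property the abstract credits for the improved convergence.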
About the journal:
Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.