Myisha A. Chowdhury, Saif S.S. Al-Wahaibi, Qiugang Lu
Computers & Chemical Engineering, Volume 178, October 2023, Article 108393. DOI: 10.1016/j.compchemeng.2023.108393
Entropy-maximizing TD3-based reinforcement learning for adaptive PID control of dynamical systems
Proper tuning of proportional–integral–derivative (PID) control is critical for satisfactory control performance. However, existing tuning methods are often time-consuming and require system models that are difficult to obtain for complex processes. To this end, automatic PID tuning, particularly tuning based on deep reinforcement learning, eliminates the need for a system model by treating PID tuning as a black-box optimization. However, these methods suffer from low sample efficiency. In this paper, we present an entropy-maximizing twin-delayed deep deterministic policy gradient (EMTD3) method for automatic PID tuning. In our method, an entropy-maximizing stochastic actor is deployed at the beginning to ensure sufficient exploration, followed by a deterministic actor that focuses on local exploitation. Such a hybrid approach can enhance sample efficiency and thereby facilitate PID tuning. Extensive simulation studies show the superior performance of the proposed method relative to other methods in data efficiency, adaptivity, and robustness.
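To make the explore-then-exploit idea concrete, the sketch below frames PID tuning as a black-box optimization over the gains (Kp, Ki, Kd): a broad stochastic sampling phase (standing in for the entropy-maximizing actor) is followed by deterministic local refinement around the incumbent (standing in for the TD3-style actor). This is a minimal illustration of the two-phase structure, not the paper's EMTD3 algorithm; the first-order plant, cost function, and sampling ranges are assumptions chosen for the example.

```python
import random

def step_response_cost(kp, ki, kd, dt=0.05, horizon=200):
    """Simulate a discretized first-order plant dy/dt = (u - y)/5 under PID
    control toward a unit setpoint; return the integral squared error (ISE)."""
    y, integ, prev_err, cost = 0.0, 0.0, 1.0, 0.0
    for _ in range(horizon):
        err = 1.0 - y
        integ += err * dt
        deriv = (err - prev_err) / dt
        u = kp * err + ki * integ + kd * deriv   # PID control law
        prev_err = err
        y += dt * (u - y) / 5.0                  # Euler step of the plant
        cost += err * err * dt
    return cost

def tune_pid(n_explore=200, n_exploit=100, seed=0):
    """Phase 1: high-entropy stochastic sampling of gains (exploration).
    Phase 2: small deterministic-style perturbations around the incumbent
    (local exploitation). Returns the best gains and their cost."""
    rng = random.Random(seed)
    best = (rng.uniform(0, 10), rng.uniform(0, 2), rng.uniform(0, 2))
    best_cost = step_response_cost(*best)
    for _ in range(n_explore):                   # broad exploration phase
        cand = (rng.uniform(0, 10), rng.uniform(0, 2), rng.uniform(0, 2))
        c = step_response_cost(*cand)
        if c < best_cost:
            best, best_cost = cand, c
    for _ in range(n_exploit):                   # local refinement phase
        cand = tuple(g + rng.gauss(0, 0.1) for g in best)
        c = step_response_cost(*cand)
        if c < best_cost:
            best, best_cost = cand, c
    return best, best_cost
```

The key design point mirrored here is the phase switch: early stochasticity prevents premature commitment to a poor region of the gain space, while the later deterministic refinement squeezes out residual error without wasting samples on global search.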
Journal introduction:
Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.