{"title":"一种改进的批处理强化学习控制策略","authors":"P. Zhang, Jie Zhang, Yang Long, Bingzhang Hu","doi":"10.1109/MMAR.2019.8864632","DOIUrl":null,"url":null,"abstract":"Batch processes are significant and essential manufacturing route for the agile manufacturing of high value added products and they are typically difficult to control because of unknown disturbances, model plant mismatches, and highly nonlinear characteristic. Traditional one-step reinforcement learning and neural network have been applied to optimize and control batch processes. However, traditional one-step reinforcement learning and the neural network lack accuracy and robustness leading to unsatisfactory performance. To overcome these issues and difficulties, a modified multi-step action Q-learning algorithm (MMSA) based on multiple step action Q-learning (MSA) is proposed in this paper. For MSA, the action space is divided into some periods of same time steps and the same action is explored with fixed greedy policy being applied continuously during a period. Compared with MSA, the modification of MMSA is that the exploration and selection of action will follow an improved and various greedy policy in the whole system time which can improve the flexibility and speed of the learning algorithm. The proposed algorithm is applied to a highly nonlinear batch process and it is shown giving better control performance than the traditional one-step reinforcement learning and MSA.","PeriodicalId":392498,"journal":{"name":"2019 24th International Conference on Methods and Models in Automation and Robotics (MMAR)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"An improved reinforcement learning control strategy for batch processes\",\"authors\":\"P. Zhang, Jie Zhang, Yang Long, Bingzhang Hu\",\"doi\":\"10.1109/MMAR.2019.8864632\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Batch processes are significant and essential manufacturing route for the agile manufacturing of high value added products and they are typically difficult to control because of unknown disturbances, model plant mismatches, and highly nonlinear characteristic. Traditional one-step reinforcement learning and neural network have been applied to optimize and control batch processes. However, traditional one-step reinforcement learning and the neural network lack accuracy and robustness leading to unsatisfactory performance. To overcome these issues and difficulties, a modified multi-step action Q-learning algorithm (MMSA) based on multiple step action Q-learning (MSA) is proposed in this paper. For MSA, the action space is divided into some periods of same time steps and the same action is explored with fixed greedy policy being applied continuously during a period. Compared with MSA, the modification of MMSA is that the exploration and selection of action will follow an improved and various greedy policy in the whole system time which can improve the flexibility and speed of the learning algorithm. 
The proposed algorithm is applied to a highly nonlinear batch process and it is shown giving better control performance than the traditional one-step reinforcement learning and MSA.\",\"PeriodicalId\":392498,\"journal\":{\"name\":\"2019 24th International Conference on Methods and Models in Automation and Robotics (MMAR)\",\"volume\":\"28 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 24th International Conference on Methods and Models in Automation and Robotics (MMAR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MMAR.2019.8864632\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 24th International Conference on Methods and Models in Automation and Robotics (MMAR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MMAR.2019.8864632","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
An improved reinforcement learning control strategy for batch processes
Batch processes are a significant and essential manufacturing route for the agile manufacturing of high value-added products, and they are typically difficult to control because of unknown disturbances, model-plant mismatches, and highly nonlinear characteristics. Traditional one-step reinforcement learning and neural networks have been applied to optimize and control batch processes. However, they lack accuracy and robustness, leading to unsatisfactory performance. To overcome these issues, a modified multi-step action Q-learning algorithm (MMSA), based on multi-step action Q-learning (MSA), is proposed in this paper. In MSA, the control horizon is divided into periods of the same number of time steps, and within each period a single action, selected under a fixed greedy policy, is applied continuously. The modification in MMSA is that action exploration and selection follow an improved, time-varying greedy policy over the whole system time, which improves the flexibility and learning speed of the algorithm. The proposed algorithm is applied to a highly nonlinear batch process and is shown to give better control performance than traditional one-step reinforcement learning and MSA.
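
As a rough illustration of the MSA/MMSA idea described in the abstract, the sketch below shows tabular Q-learning with multi-step (held) actions: repeating one action over a fixed period plays the role of MSA, and an epsilon-greedy schedule that decays with both the episode count and the position in the batch stands in for the improved, time-varying greedy policy of MMSA. The toy batch-process surrogate, state/action discretisation, constants, and decay schedule are illustrative assumptions, not the paper's actual formulation.

# Minimal sketch: multi-step action Q-learning with a varying epsilon-greedy policy.
# The environment, discretisation, and all constants are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

N_STATES = 50        # discretised process states (assumption)
N_ACTIONS = 5        # discretised control actions (assumption)
HORIZON = 100        # time steps per batch run
PERIOD = 10          # steps over which one action is held (the MSA idea)
EPISODES = 500
ALPHA, GAMMA = 0.1, 0.95

Q = np.zeros((N_STATES, N_ACTIONS))

def step(state, action):
    """Toy surrogate for one step of the batch process (assumption)."""
    next_state = (state + action - N_ACTIONS // 2) % N_STATES
    reward = -abs(next_state - N_STATES // 2) / N_STATES  # track a set-point
    return next_state, reward

def epsilon(episode, t):
    """MMSA-style varying greedy policy: exploration decays with both the
    episode count and the position in the batch, instead of staying fixed."""
    return max(0.01, 0.5 * (1.0 - episode / EPISODES) * (1.0 - t / HORIZON))

for ep in range(EPISODES):
    state = rng.integers(N_STATES)
    for t0 in range(0, HORIZON, PERIOD):
        # Choose one action for the whole period (multi-step action).
        if rng.random() < epsilon(ep, t0):
            action = int(rng.integers(N_ACTIONS))
        else:
            action = int(np.argmax(Q[state]))
        # Hold the action for PERIOD steps and accumulate the discounted return.
        g, s = 0.0, state
        for k in range(PERIOD):
            s, r = step(s, action)
            g += (GAMMA ** k) * r
        # Multi-step Q-learning update for the held action.
        target = g + (GAMMA ** PERIOD) * np.max(Q[s])
        Q[state, action] += ALPHA * (target - Q[state, action])
        state = s

print("Learned greedy action from the mid state:", int(np.argmax(Q[N_STATES // 2])))

Setting epsilon to a constant recovers the fixed greedy policy of MSA in this sketch; the decaying schedule is one plausible reading of the "improved and varying greedy policy" described in the abstract.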