Self-improving Q-learning based controller for a class of dynamical processes
Authors: J. Musial, K. Stebel, Jacek Czeczot
Journal: Archives of Control Sciences (impact factor 1.1; JCR Q4, Automation & Control Systems)
DOI: https://doi.org/10.24425/acs.2021.138691
Listed publication date: 2023-07-20
Citations: 1
Abstract
This paper presents how the Q-learning algorithm can be applied as a general-purpose self-improving controller for industrial automation, as a substitute for a conventional PI controller implemented without proper tuning. The traditional Q-learning approach is redefined to better fit practical control loops, including a new definition of the goal state by the closed-loop reference trajectory and discretization of the state space and of the accessible actions (manipulated variables). Properties of the Q-learning algorithm are investigated in terms of practical applicability, with special emphasis on initializing the Q-matrix based only on preliminary PI tunings to ensure bumpless switching between the existing controller and the Q-learning algorithm that replaces it. A general approach to the design of the Q-matrix and the learning policy is suggested, and the concept is systematically validated by simulation on two example processes, one exhibiting first-order dynamics and one exhibiting oscillatory second-order dynamics. The results show that online learning through interaction with the controlled process is possible and that it significantly improves control performance compared to an arbitrarily tuned PI controller.
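As a rough illustration of the ingredients named in the abstract (a discretized state, a discrete set of control actions, and a Q-matrix updated online), the following is a minimal sketch only. It uses the textbook tabular Q-learning update, not the redefined scheme or the PI-based Q-matrix initialization proposed in the paper. The first-order process model, bin edges, action set, reward, actuator limits, and hyperparameters are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import random


def discretize(value, edges):
    """Return the index of the bin (0..len(edges)) that contains value."""
    idx = 0
    for edge in edges:
        if value >= edge:
            idx += 1
    return idx


def q_update(Q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update in place."""
    best_next = max(Q[s_next])
    Q[s][a] += alpha * (reward + gamma * best_next - Q[s][a])
    return Q[s][a]


def run_episode(Q, edges, actions, rng, setpoint=1.0, steps=200, epsilon=0.1):
    """Control an assumed first-order process y[k+1] = 0.9*y[k] + 0.1*u[k]."""
    y, u = 0.0, 0.0
    s = discretize(setpoint - y, edges)  # state = discretized control error
    for _ in range(steps):
        # Epsilon-greedy choice among increments of the manipulated variable.
        if rng.random() < epsilon:
            a = rng.randrange(len(actions))
        else:
            a = max(range(len(actions)), key=lambda i: Q[s][i])
        u = min(max(u + actions[a], 0.0), 2.0)  # assumed actuator limits
        y = 0.9 * y + 0.1 * u
        error = setpoint - y
        s_next = discretize(error, edges)
        q_update(Q, s, a, -abs(error), s_next)  # assumed reward: -|error|
        s = s_next
    return abs(setpoint - y)


# Illustrative setup: 5 error bins, 3 possible increments of u.
edges = [-0.5, -0.1, 0.1, 0.5]
actions = [-0.2, 0.0, 0.2]
Q = [[0.0] * len(actions) for _ in range(len(edges) + 1)]
rng = random.Random(0)
for _ in range(50):
    final_error = run_episode(Q, edges, actions, rng)
```

Here the Q-matrix starts at zero, so early behaviour is essentially exploratory; the paper's point about initializing the Q-matrix from existing PI tunings is aimed at avoiding exactly that kind of uninformed start.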
About the journal:
Archives of Control Sciences welcomes for consideration papers on topics of significance in broadly understood control science and related areas, including: basic control theory, optimal control, optimization methods, control of complex systems, mathematical modeling of dynamic and control systems, expert and decision support systems and diverse methods of knowledge modelling and representing uncertainty (by stochastic, set-valued, fuzzy or rough set methods, etc.), robotics and flexible manufacturing systems. Related areas that are covered include information technology, parallel and distributed computations, neural networks and mathematical biomedicine, mathematical economics, applied game theory, financial engineering, business informatics and other similar fields.