{"title":"Discrete-time optimal control scheme based on Q-learning algorithm","authors":"Qinglai Wei, Derong Liu, Ruizhuo Song","doi":"10.1109/ICICIP.2016.7885888","DOIUrl":null,"url":null,"abstract":"This paper is concerned with optimal control problems of discrete-time nonlinear systems via a novel Q-learning algorithm. In the newly developed Q-learning algorithm, the iterative Q function in each iteration is required to update on the whole state and control spaces, instead of being updated by a single state and control pair. A new convergence criterion of the corresponding Q-learning algorithm is presented, where the traditional constraints for the learning rates of Q-learning algorithms is relaxed. Finally, simulation results are provided to exemplify the good performance of the developed algorithm.","PeriodicalId":226381,"journal":{"name":"2016 Seventh International Conference on Intelligent Control and Information Processing (ICICIP)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 Seventh International Conference on Intelligent Control and Information Processing (ICICIP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICICIP.2016.7885888","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
This paper is concerned with optimal control problems of discrete-time nonlinear systems via a novel Q-learning algorithm. In the newly developed Q-learning algorithm, the iterative Q function in each iteration is updated over the whole state and control spaces, instead of by a single state-control pair. A new convergence criterion for the corresponding Q-learning algorithm is presented, in which the traditional constraints on the learning rates of Q-learning algorithms are relaxed. Finally, simulation results are provided to exemplify the good performance of the developed algorithm.
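The abstract does not spell out the update rule, but the idea of sweeping the entire (discretized) state and control spaces per iteration, rather than sampling one state-control pair, can be sketched roughly as follows. Everything here is an assumption for illustration: the toy linear system, quadratic cost, grid sizes, and helper names are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of a full-space Q iteration.
# Assumed toy system: x_{k+1} = 0.9*x_k + u_k, stage cost x^2 + u^2.
import numpy as np

states = np.linspace(-1.0, 1.0, 21)    # discretized state space (assumed grid)
controls = np.linspace(-0.5, 0.5, 11)  # discretized control space (assumed grid)
Q = np.zeros((len(states), len(controls)))

def nearest(x):
    # index of the grid state closest to x (simple projection back onto the grid)
    return int(np.argmin(np.abs(states - x)))

for _ in range(200):
    Q_new = np.empty_like(Q)
    # unlike classical Q-learning, every (x, u) pair is updated each iteration
    for i, x in enumerate(states):
        for j, u in enumerate(controls):
            x_next = np.clip(0.9 * x + u, states[0], states[-1])
            # Bellman-style update: stage cost plus the best Q value at the successor state
            Q_new[i, j] = x**2 + u**2 + Q[nearest(x_next)].min()
    Q = Q_new
```

After convergence, a greedy control law can be read off as `controls[np.argmin(Q[i])]` for each state index `i`; the paper's actual algorithm and convergence analysis concern learning-rate conditions not modeled in this sketch.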