Online Actor-critic Network Algorithm to Solve infinite-Horizon Optimal Tracking Control Problem for Discrete-time Systems
Mei Li, Zhong Ming, Jiayue Sun
2022 37th Youth Academic Annual Conference of Chinese Association of Automation (YAC), published 2022-11-19
DOI: 10.1109/YAC57282.2022.10023621 (https://doi.org/10.1109/YAC57282.2022.10023621)
Citations: 0
Abstract
This paper defines a novel value function that eliminates the tracking error in the infinite-horizon optimal tracking control problem, which is solved with the adaptive dynamic programming (ADP) technique. First, instead of the quadratic form used in existing ADP approaches, a new formulation of the cost function is obtained by error accumulation. Second, the performance index and the control input are approximated by a critic neural network (NN) and an actor NN, respectively. Finally, theoretical analysis shows that the system state and the network weight errors are uniformly ultimately bounded (UUB). This is the first stability analysis of an ADP method that constructs a novel value function to eliminate the tracking error. The theoretical results are also supported by a simulation.
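For context, the quadratic form that existing ADP tracking approaches typically minimize can be sketched as below. The symbols (state x_k, reference r_k, tracking error e_k, control u_k, weights Q and R) are notation assumed here for illustration and are not taken from the paper. Depending on the formulation, u_k may denote the control itself or its deviation from a steady-state control. The paper's own error-accumulation cost is not reproduced, since the abstract does not state it.

```latex
% Tracking error (assumed notation)
e_k = x_k - r_k

% Conventional infinite-horizon quadratic performance index
J(e_k) = \sum_{i=k}^{\infty} \left( e_i^{\top} Q \, e_i + u_i^{\top} R \, u_i \right),
\qquad Q \succeq 0, \quad R \succ 0

% Equivalent recursive (Bellman) form
J(e_k) = e_k^{\top} Q \, e_k + u_k^{\top} R \, u_k + J(e_{k+1})
```

In an actor-critic scheme of the kind the abstract describes, the critic network approximates the performance index J and the actor network approximates the control input that minimizes it.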