{"title":"Evolutionary adaptive-critic methods for reinforcement learning","authors":"Xin Xu, Han-gen He, D. Hu","doi":"10.1109/CEC.2002.1004434","DOIUrl":null,"url":null,"abstract":"In this paper, a novel hybrid learning method is proposed for reinforcement learning problems with continuous state and action spaces. The reinforcement learning problems are modeled as Markov decision processes (MDPs) and the hybrid learning method combines evolutionary algorithms with gradient-based adaptive heuristic critic (AHC) algorithms to approximate the optimal policy of MDPs. The suggested method takes the advantages of evolutionary learning and gradient-based reinforcement learning to solve reinforcement learning problems. Simulation results on the learning control of an acrobot illustrate the efficiency of the presented method.","PeriodicalId":184547,"journal":{"name":"Proceedings of the 2002 Congress on Evolutionary Computation. CEC'02 (Cat. No.02TH8600)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2002-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2002 Congress on Evolutionary Computation. CEC'02 (Cat. No.02TH8600)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CEC.2002.1004434","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
In this paper, a novel hybrid learning method is proposed for reinforcement learning problems with continuous state and action spaces. The reinforcement learning problems are modeled as Markov decision processes (MDPs) and the hybrid learning method combines evolutionary algorithms with gradient-based adaptive heuristic critic (AHC) algorithms to approximate the optimal policy of MDPs. The suggested method takes the advantages of evolutionary learning and gradient-based reinforcement learning to solve reinforcement learning problems. Simulation results on the learning control of an acrobot illustrate the efficiency of the presented method.