Decomposition Based Multi-Objective Evolutionary Algorithm in XCS for Multi-Objective Reinforcement Learning
Xiu Cheng, Will N. Browne, Mengjie Zhang
2018 IEEE Congress on Evolutionary Computation (CEC), published 2018-07-01
DOI: 10.1109/CEC.2018.8477931 (https://doi.org/10.1109/CEC.2018.8477931)
Citation count: 1
Abstract
Learning Classifier Systems (LCSs) have been widely used to tackle Reinforcement Learning (RL) problems because they generalize well and provide simple, understandable rule-based solutions. The accuracy-based LCS, XCS, has been the most widely used variant for single-objective RL problems. As many real-world problems exhibit multiple conflicting objectives, recent work has sought to adapt XCS to Multi-Objective Reinforcement Learning (MORL) tasks. However, many of these algorithms need large storage or cannot discover the Pareto optimal solutions, owing to the complexity of finding multi-step policies toward multiple possible objectives. This paper employs a decomposition strategy based on MOEA/D in XCS to approximate complex Pareto fronts, yielding a new MORL algorithm built on XCS and MOEA/D. Experimental results show that, on complex bi-objective maze problems, the algorithm learns a group of Pareto optimal solutions without huge storage. Analysis of the learned policies shows successful trade-offs between the distance to the reward and the amount of reward itself.
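The decomposition idea behind MOEA/D can be illustrated with a minimal sketch. The abstract does not say which scalarization the paper uses, so the weighted-sum form below is an assumption, chosen because it is one of the standard MOEA/D options. The function names and the candidate reward vectors are hypothetical: each pair stands for a normalized (closeness to reward, amount of reward) outcome of some policy, and is not data from the paper.

```python
from typing import Dict, List, Tuple

Vector = Tuple[float, float]


def weight_vectors(n: int) -> List[Vector]:
    """Evenly spaced bi-objective weight vectors (w1, w2) with w1 + w2 = 1."""
    if n < 2:
        raise ValueError("need at least two weight vectors")
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]


def weighted_sum(rewards: Vector, weights: Vector) -> float:
    """Scalarize a reward vector for one subproblem (assumed scalarization)."""
    return sum(w * r for w, r in zip(weights, rewards))


def best_per_subproblem(candidates: List[Vector], n: int) -> Dict[Vector, Vector]:
    """For each weight vector, pick the candidate with the highest scalar value."""
    return {
        w: max(candidates, key=lambda c: weighted_sum(c, w))
        for w in weight_vectors(n)
    }


# Hypothetical normalized outcomes: (closeness to reward, amount of reward).
# The last candidate is dominated by (0.6, 0.7) and should never be selected.
candidates = [(1.0, 0.2), (0.6, 0.7), (0.1, 1.0), (0.3, 0.4)]
solutions = best_per_subproblem(candidates, 3)
for w, s in solutions.items():
    print(w, "->", s)
```

Each weight vector defines one single-objective subproblem, and the set of subproblem optima approximates the Pareto front. A weighted sum cannot reach non-convex regions of a front; other scalarizations, such as the Tchebycheff approach, are commonly used when that matters.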