A new approach for structural credit assignment in distributed reinforcement learning systems
Authors: Zhong Yu, Gu Guo-chang, Zhang Rubo
Published in: 2003 IEEE International Conference on Robotics and Automation (Cat. No.03CH37422)
Publication date: 2003-11-10
DOI: 10.1109/ROBOT.2003.1241758 (https://doi.org/10.1109/ROBOT.2003.1241758)
Citation count: 2
Abstract
Most existing algorithms for structural credit assignment were developed for competitive reinforcement learning systems. In a competitive reinforcement learning system, agents are activated one at a time, so only one agent is active at any moment and structural credit assignment can be implemented with temporal credit assignment algorithms. In collaborative reinforcement learning systems, agents are activated simultaneously, so transforming the global reinforcement signal fed back from the environment into a reinforcement vector, with one component per agent, is a crucial difficulty that cannot be sidestepped. This article presents a feasible and efficient algorithm that offers a first solution to the structural credit assignment problem in collaborative reinforcement learning systems. Experiments show that the algorithm converges very rapidly and that the resulting credit assignment is satisfactory.