{"title":"多智能体强化学习中的局部协调","authors":"Fanchao Xu, Tomoyuki Kaneko","doi":"10.1109/taai54685.2021.00036","DOIUrl":null,"url":null,"abstract":"This paper studies cooperative multi-agent reinforcement learning problems where agents pursue a common goal through their cooperation. Because each agent needs to act individually on the basis on its local observation, the difficulty of learning depends on to what extent information can be exchanged among agents. We extend value-decomposition networks (VDN), a framework requiring the least communication, by allowing information exchange within a local group and present residual group VDN (RGV). We empirically show that the performance of RGV is better than VDN and other state-of-the-art methods in the predator-prey game. Also, on three tasks in the StarCraft Multi-Agent Challenge, RGV showed comparable performance with more sophisticated methods utilizing more information or communication. Therefore, our RGV is an alternative method worth further research.","PeriodicalId":343821,"journal":{"name":"2021 International Conference on Technologies and Applications of Artificial Intelligence (TAAI)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Local Coordination in Multi-Agent Reinforcement Learning\",\"authors\":\"Fanchao Xu, Tomoyuki Kaneko\",\"doi\":\"10.1109/taai54685.2021.00036\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper studies cooperative multi-agent reinforcement learning problems where agents pursue a common goal through their cooperation. Because each agent needs to act individually on the basis on its local observation, the difficulty of learning depends on to what extent information can be exchanged among agents. 
We extend value-decomposition networks (VDN), a framework requiring the least communication, by allowing information exchange within a local group and present residual group VDN (RGV). We empirically show that the performance of RGV is better than VDN and other state-of-the-art methods in the predator-prey game. Also, on three tasks in the StarCraft Multi-Agent Challenge, RGV showed comparable performance with more sophisticated methods utilizing more information or communication. Therefore, our RGV is an alternative method worth further research.\",\"PeriodicalId\":343821,\"journal\":{\"name\":\"2021 International Conference on Technologies and Applications of Artificial Intelligence (TAAI)\",\"volume\":\"10 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 International Conference on Technologies and Applications of Artificial Intelligence (TAAI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/taai54685.2021.00036\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Technologies and Applications of Artificial Intelligence (TAAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/taai54685.2021.00036","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Local Coordination in Multi-Agent Reinforcement Learning
This paper studies cooperative multi-agent reinforcement learning problems in which agents pursue a common goal through cooperation. Because each agent must act individually on the basis of its local observation, the difficulty of learning depends on the extent to which information can be exchanged among agents. We extend value-decomposition networks (VDN), a framework requiring the least communication, by allowing information exchange within a local group, and present residual group VDN (RGV). We empirically show that RGV outperforms VDN and other state-of-the-art methods in the predator-prey game. Moreover, on three tasks in the StarCraft Multi-Agent Challenge, RGV achieved performance comparable to that of more sophisticated methods that use more information or communication. RGV is therefore an alternative method worth further research.
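The core idea the abstract describes can be sketched in a few lines: VDN factors the joint action-value into a sum of per-agent values computed from local observations only, and an RGV-style extension adds a correction term computed from information shared within a local group. The abstract does not specify the actual RGV architecture, so the linear "networks", the group definition, and the residual form below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Problem sizes for the sketch (arbitrary choices).
n_agents, obs_dim, n_actions = 4, 8, 5

# Per-agent value "networks", stubbed as fixed random linear maps.
W = [rng.standard_normal((n_actions, obs_dim)) for _ in range(n_agents)]

def agent_q(i, obs):
    """Q-values of agent i, computed from its local observation only."""
    return W[i] @ obs

def vdn_joint_q(observations, actions):
    """VDN: the joint action-value is the sum of per-agent values."""
    return float(sum(agent_q(i, observations[i])[actions[i]]
                     for i in range(n_agents)))

# Hypothetical RGV-style residual: agents in a local group exchange
# observations, and a group term corrects the purely additive VDN value.
group = [0, 1]  # one local group of cooperating agents (assumption)
w_group = rng.standard_normal(obs_dim * len(group))

def rgv_joint_q(observations, actions):
    """VDN value plus a residual computed from the group's shared information."""
    group_obs = np.concatenate([observations[i] for i in group])
    residual = float(w_group @ group_obs)
    return vdn_joint_q(observations, actions) + residual

obs = [rng.standard_normal(obs_dim) for _ in range(n_agents)]
acts = [0, 1, 2, 3]
print(vdn_joint_q(obs, acts), rgv_joint_q(obs, acts))
```

The sketch shows why the extension is cheap in communication: only agents inside a group exchange observations, while the overall value remains a simple sum that each agent can optimize greedily over its own actions.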