Cooperative Behavior by Multi-agent Reinforcement Learning with Abstractive Communication
Jin Tanda, Ahmed Moustafa, Takayuki Ito
2019 IEEE International Conference on Agents (ICA), published 2019-10-01
DOI: 10.1109/AGENTS.2019.8929151 (https://doi.org/10.1109/AGENTS.2019.8929151)
Reinforcement learning (RL) is a major area of machine learning that aims to develop intelligent agents able to adapt appropriately to random environments. RL has shown good results on complex tasks such as playing video games, and recent developments have shown its strong potential in multi-agent environments as well. However, few studies focus on developing cooperation among learning agents, even though cooperative behavior among learning agents generally yields higher performance than independent behavior. In this research, we therefore focus on cooperative behavior in the Predator-Prey game in a continuous space, which is widely used as a typical simulation of a multi-agent environment. In particular, we focus on the predators, whose goal is to catch the prey. We propose a Leader-Follower model as the organization of the predators and use an RL model to investigate how they cooperate to achieve their goal while taking the prey's policy into account. Our results indicate that communication between the Leader and the Followers contributes to high performance. We also obtain an interesting result concerning the process by which the predators achieve their goal: we examine their movement trajectories under three different reward settings and find that they adopt a different policy in each case, depending on the reward. We visualize these trajectories and discuss the predators' cooperation and its effectiveness.