Cooperative Behavior by Multi-agent Reinforcement Learning with Abstractive Communication

Jin Tanda, Ahmed Moustafa, Takayuki Ito
2019 IEEE International Conference on Agents (ICA), 2019-10-01. DOI: 10.1109/AGENTS.2019.8929151 (https://doi.org/10.1109/AGENTS.2019.8929151)

Abstract

Reinforcement learning (RL) is a major area of machine learning that aims to develop intelligent agents able to adapt appropriately to random environments. RL has shown good results on complex tasks such as playing video games, and recent developments have shown its strong potential in multi-agent environments as well. However, few studies focus on developing cooperation among learning agents, even though cooperative behavior among learning agents generally achieves higher performance than independent behavior. In this research, we therefore focus on cooperative behavior in the Predator-Prey game in a continuous space, which is widely used as a typical simulation of a multi-agent environment. In particular, we focus on the predators, whose goal is to catch the prey. We propose a Leader-Follower model as the organization of the predators and use an RL model to investigate how they cooperate to achieve their goal while taking the prey's policy into account. Our results indicate that communication between the Leader and the Followers contributes to high performance. We also obtain an interesting result concerning the process by which the predators achieve their goal: we investigate their movement loci under three different reward settings and find that they adopt a different policy in each case, depending on the reward. We visualize these movement loci and discuss the predators' cooperation and its effectiveness.
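
The abstract does not specify the environment, the message format, or the reward settings, so the following is only a minimal sketch of one possible reading of a Leader-Follower organization in a continuous Predator-Prey space. Every name and constant here (`Agent`, `leader_message`, `follower_direction`, `step`, `CAPTURE_RADIUS`, `STEP_SIZE`, the eight-sector message) is a hypothetical choice for illustration, not the authors' method. The movement rules are hand-coded where the paper would use learned RL policies, and the prey is kept stationary for brevity.

```python
import math
from dataclasses import dataclass

CAPTURE_RADIUS = 0.1  # assumed value; not taken from the paper
STEP_SIZE = 0.05      # assumed value; not taken from the paper


@dataclass
class Agent:
    x: float
    y: float

    def move(self, dx: float, dy: float, step: float = STEP_SIZE) -> None:
        """Move a fixed distance along (dx, dy); stay put if the direction is zero."""
        norm = math.hypot(dx, dy)
        if norm > 0:
            self.x += step * dx / norm
            self.y += step * dy / norm


def leader_message(leader: Agent, prey: Agent, sectors: int = 8) -> int:
    """Compress the prey's bearing, as seen by the leader, into one coarse symbol.

    This is one hypothetical form of "abstractive" communication: the leader
    sends a sector index instead of raw coordinates.
    """
    angle = math.atan2(prey.y - leader.y, prey.x - leader.x) % (2 * math.pi)
    width = 2 * math.pi / sectors
    # Shift by half a sector so that sector 0 is centered on the +x axis.
    return int((angle + width / 2) // width) % sectors


def follower_direction(message: int, sectors: int = 8) -> tuple[float, float]:
    """Decode the leader's symbol back into a unit direction vector."""
    angle = message * 2 * math.pi / sectors
    return math.cos(angle), math.sin(angle)


def step(leader: Agent, followers: list[Agent], prey: Agent) -> bool:
    """Advance all predators by one step; return True if any predator catches the prey."""
    msg = leader_message(leader, prey)
    # The leader chases the prey directly (a stand-in for a learned policy).
    leader.move(prey.x - leader.x, prey.y - leader.y)
    # Followers never observe the prey; they act only on the leader's message.
    dx, dy = follower_direction(msg)
    for follower in followers:
        follower.move(dx, dy)
    predators = [leader, *followers]
    return any(
        math.hypot(p.x - prey.x, p.y - prey.y) <= CAPTURE_RADIUS for p in predators
    )
```

In an RL setting, the hand-coded rules in `step` would be replaced by policies trained against a reward signal; the sketch only illustrates how a coarse message from the Leader can be the Followers' sole source of information about the prey.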