GCMA: An Adaptive Multiagent Reinforcement Learning Framework With Group Communication for Complex and Similar Tasks Coordination

IF 1.7 · Zone 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Kexing Peng;Tinghuai Ma;Xin Yu;Huan Rong;Yurong Qian;Najla Al-Nabhan
DOI: 10.1109/TG.2023.3346394
Journal: IEEE Transactions on Games, vol. 16, no. 3, pp. 670-682
Published: 2023-12-25 (Journal Article)
Source: https://ieeexplore.ieee.org/document/10373072/
Citations: 0

Abstract

Coordinating multiple agents with diverse tasks and changing goals without mutual interference is challenging. Multiagent reinforcement learning (MARL) aims to develop effective communication and joint policies through group learning. Some previous approaches require each agent to maintain its own set of networks independently, so interactions between agents are not considered. Joint communication, in turn, causes agents to receive information unrelated to their own tasks. Currently, agents with different task divisions are often grouped by action tendency, but this can lead to poor dynamic grouping. This article presents a two-phase solution that addresses these issues. The first phase develops joint communication policies for heterogeneous agents using a group communication MARL framework (GCMA). The framework employs a periodic grouping strategy that reduces exploration and communication redundancy by dynamically assigning hidden features to agent groups through a hypernetwork and graph communication. The scheme uses resources efficiently when adapting to multiple similar tasks. In the second phase, each agent's policy network is distilled into a generalized simple network that adapts to similar tasks of varying quantities and sizes. GCMA is tested in complex environments, such as StarCraft II and unmanned aerial vehicle (UAV) take-off, where it performs well on large-scale coordinated tasks. Multitask tests with simulated pedestrians further show that GCMA generalizes well.
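
The periodic grouping and group-restricted communication described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the grouping rule or the message function, so the sketch assumes a few k-means steps over hidden features as a stand-in for the hypernetwork-based grouping, and a within-group mean as a stand-in for graph communication. All function names and parameters (`assign_groups`, `group_communicate`, `rollout`, `regroup_every`) are hypothetical.

```python
import numpy as np


def assign_groups(hidden, num_groups, seed=0):
    """Toy grouping: a few k-means steps over agents' hidden features.

    Assumption: stands in for the paper's hypernetwork-based grouping.
    """
    rng = np.random.default_rng(seed)
    # Pick distinct agents as initial group centers (fancy indexing copies).
    centers = hidden[rng.choice(len(hidden), size=num_groups, replace=False)]
    labels = np.zeros(len(hidden), dtype=int)
    for _ in range(10):
        # Distance from every agent to every center: shape (agents, groups).
        dists = np.linalg.norm(hidden[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        for g in range(num_groups):
            if np.any(labels == g):
                centers[g] = hidden[labels == g].mean(axis=0)
    return labels


def group_communicate(hidden, labels):
    """Each agent receives the mean hidden feature of its own group.

    Assumption: a mean over group members (including the agent itself)
    stands in for graph communication. Agents in other groups are masked out.
    """
    same_group = labels[:, None] == labels[None, :]   # (agents, agents) mask
    counts = same_group.sum(axis=1, keepdims=True)    # always >= 1 (self)
    return (same_group.astype(hidden.dtype) @ hidden) / counts


def rollout(hidden, num_groups, steps, regroup_every):
    """Regroup every `regroup_every` steps, communicate within groups otherwise."""
    labels = None
    for t in range(steps):
        if t % regroup_every == 0:
            labels = assign_groups(hidden, num_groups, seed=t)
        # Blend each agent's own feature with its group message.
        hidden = 0.5 * hidden + 0.5 * group_communicate(hidden, labels)
    return hidden, labels
```

Because the mask in `group_communicate` zeroes out agents from other groups, no agent receives features tied to another group's task, which is the redundancy the abstract says joint communication suffers from. Regrouping only every `regroup_every` steps keeps the group structure stable between updates.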
Source journal: IEEE Transactions on Games (Engineering: Electrical and Electronic Engineering)
CiteScore: 4.60
Self-citation rate: 8.70%
Articles published: 87