{"title":"GDT: Multi-agent reinforcement learning framework based on adaptive grouping dynamic topological space","authors":"Licheng Sun , Hongbin Ma , Zhentao Guo","doi":"10.1016/j.ins.2024.121646","DOIUrl":null,"url":null,"abstract":"<div><div>In many real-world scenarios, tasks involve coordinating multiple agents, such as managing robot clusters, drone swarms, and autonomous vehicles. These tasks are commonly addressed using Multi-Agent Reinforcement Learning (MARL). However, existing MARL algorithms often lack foresight regarding the number and types of agents involved, requiring agents to generalize across various task configurations. This may lead to suboptimal performance due to underestimated action values and the selection of less effective joint policies. To address these challenges, we propose a novel multi-agent deep reinforcement learning framework, called multi-agent reinforcement learning framework based on adaptive grouping dynamic topological space (GDT). GDT utilizes a group mesh topology to interconnect the local action value functions of each agent, enabling effective coordination and knowledge sharing among agents. By computing three different interpretations of action value functions, GDT overcomes monotonicity constraints and derives more effective overall action value functions. Additionally, GDT groups agents with high similarity to facilitate parameter sharing, thereby enhancing knowledge transfer and generalization across different scenarios. Furthermore, GDT introduces a strategy regularization method for optimal exploration of multiple action spaces. This method assigns each agent an independent entropy temperature during exploration, enabling agents to efficiently explore potential actions and approximate total state values. Experimental results demonstrate that our approach, termed GDT, significantly outperforms state-of-the-art algorithms on Google Research Football (GRF) and the StarCraft Multi-Agent Challenge (SMAC). 
Particularly in SMAC tasks, GDT achieves a success rate of nearly 100% across almost all Hard Map and Super Hard Map scenarios. Additionally, we validate the effectiveness of our algorithm on Non-monotonic Matrix Games.</div></div>","PeriodicalId":51063,"journal":{"name":"Information Sciences","volume":"691 ","pages":"Article 121646"},"PeriodicalIF":8.1000,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Sciences","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0020025524015603","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"0","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0
Abstract
In many real-world scenarios, tasks involve coordinating multiple agents, such as managing robot clusters, drone swarms, and autonomous vehicles. These tasks are commonly addressed with Multi-Agent Reinforcement Learning (MARL). However, existing MARL algorithms often lack foresight regarding the number and types of agents involved, requiring agents to generalize across varied task configurations; this can lead to suboptimal performance due to underestimated action values and the selection of less effective joint policies. To address these challenges, we propose a novel multi-agent deep reinforcement learning framework based on adaptive grouping dynamic topological space (GDT). GDT uses a group mesh topology to interconnect the local action value functions of the agents, enabling effective coordination and knowledge sharing among them. By computing three different interpretations of the action value function, GDT overcomes monotonicity constraints and derives a more effective overall action value function. Additionally, GDT groups agents with high similarity to facilitate parameter sharing, thereby enhancing knowledge transfer and generalization across scenarios. Furthermore, GDT introduces a policy regularization method for exploring multiple action spaces: each agent is assigned an independent entropy temperature during exploration, enabling agents to efficiently explore potential actions and approximate the total state value. Experimental results demonstrate that GDT significantly outperforms state-of-the-art algorithms on Google Research Football (GRF) and the StarCraft Multi-Agent Challenge (SMAC). In particular, on SMAC, GDT achieves a success rate of nearly 100% on almost all Hard and Super Hard maps. We also validate the effectiveness of the algorithm on non-monotonic matrix games.
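The per-agent entropy temperature mentioned in the abstract can be illustrated with a minimal sketch. The softmax policy and soft (log-sum-exp) state value below are standard entropy-regularization constructs assumed for illustration only; the function names and the exact form are not taken from the paper, which does not expose its implementation in the abstract.

```python
import numpy as np

def entropy_regularized_policy(q_values, temperature):
    """Softmax policy over an agent's local action values.

    A higher temperature flattens the distribution (more exploration);
    a lower temperature concentrates mass on the greedy action.
    """
    logits = q_values / temperature
    logits = logits - logits.max()  # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

def soft_state_value(q_values, temperature):
    """Entropy-augmented (soft) state value: V = tau * logsumexp(Q / tau)."""
    logits = q_values / temperature
    m = logits.max()
    return temperature * (m + np.log(np.exp(logits - m).sum()))

# One agent's local action values; two temperature settings stand in for
# two agents that each keep an independent exploration temperature.
q_agent = np.array([1.0, 0.5, -0.2])
pi_explore = entropy_regularized_policy(q_agent, temperature=5.0)  # near-uniform
pi_exploit = entropy_regularized_policy(q_agent, temperature=0.1)  # near-greedy
```

Because each agent holds its own temperature, agents with well-estimated values can act near-greedily while others keep exploring, and the soft value upper-bounds the greedy value, which is one standard way to counter underestimation of action values.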
Journal introduction:
Information Sciences (Informatics and Computer Science, Intelligent Systems, Applications) is an esteemed international journal that focuses on publishing original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and survey contributions.
Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.