A Local Information Aggregation-Based Multiagent Reinforcement Learning for Robot Swarm Dynamic Task Allocation

IF 8.9, Zone 1 (Computer Science), JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Neural Networks and Learning Systems, Pub Date: 2025-04-23, DOI: 10.1109/TNNLS.2025.3558282

Yang Lv;Jinlong Lei;Peng Yi

Volume 36, Issue 6, pp. 10437-10449. Publication type: Journal Article. Open access: no. Source: Semanticscholar. Full record: https://ieeexplore.ieee.org/document/10974732/

Citations: 0

Abstract


A Local Information Aggregation-Based Multiagent Reinforcement Learning for Robot Swarm Dynamic Task Allocation

In this article, we explore how to optimize task allocation for robot swarms in dynamic environments, emphasizing the necessity of formulating robust, flexible, and scalable strategies for robot cooperation. We introduce a novel framework using a decentralized partially observable Markov decision process (Dec-POMDP), specifically designed for distributed robot swarm networks. At the core of our methodology is the local information aggregation multiagent deep deterministic policy gradient (LIA-MADDPG) algorithm, which merges centralized training with distributed execution. During the centralized training phase, a local information aggregation (LIA) module is meticulously designed to gather critical data from neighboring robots, enhancing decision-making efficiency. In the distributed execution phase, a strategy improvement method is proposed to dynamically adjust task allocation based on changing and partially observable environmental conditions. Our empirical evaluations show that the LIA module can be seamlessly integrated into various centralized training and decentralized execution (CTDE)-based multiagent reinforcement learning (MARL) methods, significantly enhancing their performance. Additionally, by comparing LIA-MADDPG with six conventional reinforcement learning algorithms and a heuristic algorithm, we demonstrate its superior scalability, rapid adaptation to environmental changes, and ability to maintain both stability and convergence speed. These results underscore LIA-MADDPG’s outstanding performance and its potential to significantly improve dynamic task allocation in robot swarms through enhanced local collaboration and adaptive strategy execution.
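The abstract does not say how the LIA module combines information from neighboring robots, so the following is only an illustrative sketch of the general idea of local aggregation, not the authors' implementation. It assumes (hypothetically) that each robot has a position and a fixed-length observation vector, that "neighbors" means robots within a communication radius, and that aggregation is a simple mean; the function name aggregate_local_info and the parameter comm_radius are invented for this example.

```python
import math
from typing import List, Sequence


def aggregate_local_info(
    positions: Sequence[Sequence[float]],
    observations: Sequence[Sequence[float]],
    comm_radius: float,
) -> List[List[float]]:
    """Return, for each robot, its own observation concatenated with the
    mean observation of its neighbors (assumption: mean pooling).

    A neighbor is any other robot within comm_radius (Euclidean distance).
    A robot with no neighbors gets a zero vector for the aggregated part.
    """
    n = len(positions)
    dim = len(observations[0])
    aggregated: List[List[float]] = []
    for i in range(n):
        # Indices of robots inside robot i's communication range.
        neighbors = [
            j
            for j in range(n)
            if j != i and math.dist(positions[i], positions[j]) <= comm_radius
        ]
        if neighbors:
            # Element-wise mean over the neighbors' observation vectors.
            mean_obs = [
                sum(observations[j][k] for j in neighbors) / len(neighbors)
                for k in range(dim)
            ]
        else:
            mean_obs = [0.0] * dim
        aggregated.append(list(observations[i]) + mean_obs)
    return aggregated


# Hypothetical example: robots 0 and 1 are close together, robot 2 is isolated.
positions = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
observations = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
local_inputs = aggregate_local_info(positions, observations, comm_radius=2.0)
```

In a CTDE-style setup, vectors like local_inputs could serve as the per-robot input to a policy or critic network. Mean pooling is chosen here only because it keeps the input size fixed regardless of how many neighbors a robot has; the paper may use a different aggregation.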


Source journal

IEEE Transactions on Neural Networks and Learning Systems (categories: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, HARDWARE & ARCHITECTURE)

CiteScore: 23.80

Self-citation rate: 9.60%

Articles published: 2102

Review time: 3-8 weeks

About the journal: The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.