Multi-Agent Reinforcement Learning for Zero-Shot Coverage Path Planning with Dynamic UAV Networks

Impact Factor 5.2 | CAS Tier 2 (Computer Science) | JCR Q1, AUTOMATION & CONTROL SYSTEMS
José P. Carvalho, A. Pedro Aguiar
{"title":"Multi-Agent Reinforcement Learning for Zero-Shot Coverage Path Planning with Dynamic UAV Networks","authors":"José P. Carvalho,&nbsp;A. Pedro Aguiar","doi":"10.1016/j.robot.2025.105163","DOIUrl":null,"url":null,"abstract":"<div><div>Recent advancements in autonomous systems have enabled the development of intelligent multi-robot systems for dynamic environments. Unmanned Aerial Vehicles play an important role in multi-robot applications such as precision agriculture, search-and-rescue, and wildfire monitoring, all of which rely on solving the coverage path planning problem. While Multi-Agent Coverage Path Planning approaches have shown potential, many existing methods lack the scalability and adaptability needed for diverse and dynamic scenarios. This paper presents a decentralized Multi-Agent Coverage Path Planning framework based on Multi-Agent Reinforcement Learning with parameter sharing and Centralized Training with Decentralized Execution. The framework incorporates a customized Rainbow Deep-Q Network, a size-invariant reward function, and a robustness and safety filter to ensure completeness and reliability in dynamic environments. Our training pipeline combines curriculum learning, domain randomization, and transfer learning, enabling the model to generalize to unseen scenarios. We demonstrate zero-shot generalization on scenarios with significantly larger maps, an increased number of obstacles, and a varying number of agents compared to what is seen during training. Furthermore, the models can also adapt to more structured maps and handle different tasks, such as search-and-rescue, without the need for retraining.</div></div>","PeriodicalId":49592,"journal":{"name":"Robotics and Autonomous Systems","volume":"195 ","pages":"Article 105163"},"PeriodicalIF":5.2000,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Robotics and Autonomous Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S092188902500260X","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

Recent advancements in autonomous systems have enabled the development of intelligent multi-robot systems for dynamic environments. Unmanned Aerial Vehicles play an important role in multi-robot applications such as precision agriculture, search-and-rescue, and wildfire monitoring, all of which rely on solving the coverage path planning problem. While Multi-Agent Coverage Path Planning approaches have shown potential, many existing methods lack the scalability and adaptability needed for diverse and dynamic scenarios. This paper presents a decentralized Multi-Agent Coverage Path Planning framework based on Multi-Agent Reinforcement Learning with parameter sharing and Centralized Training with Decentralized Execution. The framework incorporates a customized Rainbow Deep-Q Network, a size-invariant reward function, and a robustness and safety filter to ensure completeness and reliability in dynamic environments. Our training pipeline combines curriculum learning, domain randomization, and transfer learning, enabling the model to generalize to unseen scenarios. We demonstrate zero-shot generalization on scenarios with significantly larger maps, an increased number of obstacles, and a varying number of agents compared to what is seen during training. Furthermore, the models can also adapt to more structured maps and handle different tasks, such as search-and-rescue, without the need for retraining.
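The abstract gives no implementation details, but the combination of parameter sharing and decentralized execution it describes can be illustrated with a minimal sketch: every agent evaluates the same Q-network (here with a dueling head, one of the Rainbow components) on its own local observation at execution time, with no communication between agents. The observation layout, action set, and layer sizes below are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): a parameter-shared Q-network used for
# decentralized execution. Each agent feeds only its own local observation into
# the single shared network; the dueling head is one Rainbow ingredient shown
# for illustration. Observation channels, action count, and sizes are assumptions.
import torch
import torch.nn as nn


class DuelingQNet(nn.Module):
    """Shared dueling Q-network: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""

    def __init__(self, obs_channels: int = 4, n_actions: int = 5):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(obs_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),  # pooling keeps the head size fixed for any map size
            nn.Flatten(),
        )
        hidden = 32 * 4 * 4
        self.value = nn.Sequential(nn.Linear(hidden, 128), nn.ReLU(), nn.Linear(128, 1))
        self.advantage = nn.Sequential(nn.Linear(hidden, 128), nn.ReLU(), nn.Linear(128, n_actions))

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        z = self.encoder(obs)
        v, a = self.value(z), self.advantage(z)
        return v + a - a.mean(dim=1, keepdim=True)


def decentralized_step(qnet: DuelingQNet, local_obs: list[torch.Tensor]) -> list[int]:
    """Each agent greedily picks an action from its own observation using the
    shared parameter set; no inter-agent communication at execution time."""
    with torch.no_grad():
        return [int(qnet(o.unsqueeze(0)).argmax(dim=1)) for o in local_obs]


if __name__ == "__main__":
    net = DuelingQNet()
    # Three agents, each with a hypothetical 4-channel local grid observation
    # (e.g. obstacles, coverage map, own position, teammate positions).
    observations = [torch.rand(4, 16, 16) for _ in range(3)]
    print(decentralized_step(net, observations))
```

The adaptive pooling layer is one simple way to keep the network's head independent of map size; the paper's size-invariant reward function and robustness and safety filter are not modeled in this sketch.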
Source Journal
Robotics and Autonomous Systems (Engineering & Technology: Robotics)
CiteScore: 9.00
Self-citation rate: 7.00%
Articles per year: 164
Review time: 4.5 months
Journal description: Robotics and Autonomous Systems carries articles describing fundamental developments in the field of robotics, with special emphasis on autonomous systems. An important goal of the journal is to extend the state of the art in both symbolic and sensory-based robot control and learning in the context of autonomous systems. The journal covers theoretical, computational, and experimental aspects of autonomous systems, or modules of such systems.