Digital Twin-Enabled Deep Reinforcement Learning for Safety-Guaranteed Flocking Motion of UAV Swarm

IF 2.5 · CAS Zone 4, Computer Science · JCR Q3, Telecommunications
Zhilin Li, Lei Lei, Gaoqing Shen, Xiaochang Liu, Xiaojiao Liu
Transactions on Emerging Telecommunications Technologies, Volume 35, Issue 11. Published 2024-11-06. Journal article.
DOI: 10.1002/ett.70011
https://onlinelibrary.wiley.com/doi/10.1002/ett.70011
Citations: 0

Abstract


Digital Twin-Enabled Deep Reinforcement Learning for Safety-Guaranteed Flocking Motion of UAV Swarm

Multi-agent deep reinforcement learning (MADRL) has become a typical paradigm for the flocking motion of UAV swarms in dynamic, stochastic environments. However, sim-to-real problems, such as the reality gap, low training efficiency, and safety risks, restrict the application of MADRL in flocking scenarios. To address these problems, we first propose a digital twin (DT)-enabled training framework, in which high-fidelity digital twin simulation allows effective policies to be trained efficiently. Building on the multi-agent proximal policy optimization (MAPPO) algorithm, we then design a learning approach for flocking motion with a matching observation space, action space, and reward function. Next, we employ a distributed flocking-center estimation algorithm based on position consensus; the estimated center is fed to the policy as an input to improve aggregation behavior. Moreover, we introduce a repulsion scheme that adds a repulsion force to the action to prevent UAVs from colliding with neighbors and obstacles. Simulation results show that our method performs well in maintaining the flocking formation and avoiding collisions, and shows better decision-making ability in near-realistic environments.
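
The abstract gives no equations, so the following is only a minimal sketch of what position-consensus center estimation and an action-level repulsion term might look like. It assumes a standard average-consensus update and an inverse-distance potential-field repulsion, which may differ from the authors' formulation. All names and parameters (`consensus_center`, `repulsion`, `safe_action`, `eps`, `safe_dist`, `gain`) are hypothetical, and the example is 2D for brevity.

```python
import math


def consensus_center(positions, neighbors, steps=50, eps=0.2):
    """Generic average consensus (an assumption, not the paper's algorithm).

    Each UAV starts from its own position and repeatedly moves its estimate
    toward the estimates of its communication neighbors. On a connected,
    undirected graph with eps < 1 / max_degree, every estimate converges to
    the mean of the initial positions, i.e. the flock center.
    """
    est = [tuple(p) for p in positions]
    for _ in range(steps):
        new = []
        for i, (xi, yi) in enumerate(est):
            dx = sum(est[j][0] - xi for j in neighbors[i])
            dy = sum(est[j][1] - yi for j in neighbors[i])
            new.append((xi + eps * dx, yi + eps * dy))
        est = new
    return est


def repulsion(own, others, safe_dist=2.0, gain=1.0):
    """Potential-field style repulsion (assumed form).

    Every neighbor or obstacle closer than safe_dist pushes the UAV directly
    away, with a magnitude that grows as the distance shrinks.
    """
    fx = fy = 0.0
    for ox, oy in others:
        dx, dy = own[0] - ox, own[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < safe_dist:
            mag = gain * (1.0 / d - 1.0 / safe_dist)
            fx += mag * dx / d
            fy += mag * dy / d
    return fx, fy


def safe_action(policy_action, own, others, safe_dist=2.0, gain=1.0):
    """Add the repulsion term to the action produced by the learned policy."""
    rx, ry = repulsion(own, others, safe_dist, gain)
    return policy_action[0] + rx, policy_action[1] + ry


if __name__ == "__main__":
    # Four UAVs on a ring communication graph.
    positions = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
    ring = [[1, 3], [0, 2], [1, 3], [2, 0]]
    print(consensus_center(positions, ring))
    print(safe_action((1.0, 0.0), (0.0, 0.0), [(1.0, 0.0)]))
```

In this sketch the consensus estimate would be appended to each agent's observation, and `safe_action` would wrap the policy output before it is sent to the vehicle. The step size `eps` must stay below the reciprocal of the largest neighbor count for the consensus iteration to remain stable.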

Source journal: CiteScore 8.90 · Self-citation rate 13.90% · Articles published 249
About the journal: Transactions on Emerging Telecommunications Technologies (ETT), formerly known as European Transactions on Telecommunications (ETT), has the following aims: to attract cutting-edge publications from leading researchers and research groups around the world; to become a highly cited source of timely research findings in emerging fields of telecommunications; to limit revision and publication cycles to a few months and thus make the journal significantly more attractive to publish in; and to become the leading journal for publishing the latest developments in telecommunications.