A 3D Spatial Information Compression Based Deep Reinforcement Learning Technique for UAV Path Planning in Cluttered Environments

Impact factor: 5.3 (JCR Q1, Engineering, Electrical & Electronic)
Zhipeng Wang; Soon Xin Ng; Mohammed El-Hajjar
Journal: IEEE Open Journal of Vehicular Technology, vol. 6, pp. 647-661
Publication date: 2025-02-10
Publication type: Journal Article
DOI: 10.1109/OJVT.2025.3540174
URL: https://ieeexplore.ieee.org/document/10878448/
Citations: 0

Abstract

Unmanned aerial vehicles (UAVs) are used in many applications, such as wireless communication, logistics transportation, agriculture and disaster prevention. Their flexible maneuverability means that UAVs often operate in complex 3D environments, which requires efficient and reliable path planning support. However, as a resource-limited platform, a UAV system cannot support highly complex path planning algorithms in many scenarios. In this paper, we propose a 3D spatial information compression (3DSIC) based deep reinforcement learning (DRL) algorithm for UAV path planning in cluttered 3D environments. Specifically, the proposed algorithm compresses the 3D spatial information to 2D through 3DSIC, and then combines the compressed 2D environment information with the spatial information of the current UAV layer to train UAV agents for path planning using neural networks. Additionally, the proposed 3DSIC is a plug-and-use module that can be combined with various DRL frameworks, such as deep Q-network (DQN) and deep deterministic policy gradient (DDPG). Our simulation results show that the training efficiency of 3DSIC-DQN is 4.028 times higher than that of a direct DQN implementation in a $100 \times 100 \times 50$ cluttered 3D environment. Furthermore, the training efficiency of 3DSIC-DDPG is 3.9 times higher than that of the traditional DDPG in the same environment. Moreover, 3DSIC combined with fast recurrent stochastic value gradient (FRSVG), which can be considered the state-of-the-art DRL algorithm for UAV path planning, trains 2.35 times faster than the original FRSVG algorithm.
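The abstract does not say how 3DSIC performs the compression, so the following is only an illustrative sketch of the general idea (a 2D summary of the 3D map plus the slice at the UAV's altitude), not the authors' method. The column-occupancy projection, the `[z][y][x]` grid layout, and the names `compress_columns`, `build_state` and `state_size` are all assumptions made for this example.

```python
from typing import List

# Assumed layout: grid[z][y][x], with 1 = obstacle and 0 = free space.
Grid3D = List[List[List[int]]]
Grid2D = List[List[float]]


def compress_columns(grid: Grid3D) -> Grid2D:
    """Hypothetical compression: fraction of occupied cells in each vertical column."""
    depth = len(grid)
    rows = len(grid[0])
    cols = len(grid[0][0])
    return [
        [sum(grid[z][y][x] for z in range(depth)) / depth for x in range(cols)]
        for y in range(rows)
    ]


def build_state(grid: Grid3D, uav_z: int) -> List[Grid2D]:
    """Stack the compressed 2D map with the occupancy slice at the UAV's layer."""
    current_layer = [[float(cell) for cell in row] for row in grid[uav_z]]
    return [compress_columns(grid), current_layer]


def state_size(state: List[Grid2D]) -> int:
    """Number of scalar inputs a network would receive for this state."""
    return sum(len(row) for channel in state for row in channel)


# Tiny 3-layer, 2x2 example (hypothetical data).
example_grid = [
    [[0, 1], [0, 0]],  # z = 0
    [[0, 1], [0, 0]],  # z = 1
    [[0, 0], [1, 0]],  # z = 2
]
example_state = build_state(example_grid, uav_z=2)
print(state_size(example_state))
```

Under this assumed two-channel layout, a $100 \times 100 \times 50$ environment would present $2 \times 100 \times 100 = 20{,}000$ inputs instead of $500{,}000$, a 25-fold reduction in input size. That figure describes only the input dimension in this sketch and is separate from the training-efficiency gains reported in the abstract.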
Source journal
CiteScore: 9.60
Self-citation rate: 0.00%
Articles published: 25
Review time: 10 weeks