A 3D Spatial Information Compression Based Deep Reinforcement Learning Technique for UAV Path Planning in Cluttered Environments

Impact factor: 4.8 · JCR Q1 (Engineering, Electrical & Electronic)

IEEE Open Journal of Vehicular Technology · Pub Date: 2025-02-10 · DOI: 10.1109/OJVT.2025.3540174

Zhipeng Wang; Soon Xin Ng; Mohammed El-Hajjar

Volume 6, pages 647-661. Full text: https://ieeexplore.ieee.org/document/10878448/

Citations: 0

Abstract

Unmanned aerial vehicles (UAVs) are used in many applications, such as wireless communication, logistics transportation, agriculture and disaster prevention. Their flexible maneuverability means that UAVs often operate in complex 3D environments, which requires efficient and reliable path planning. However, as resource-limited platforms, UAV systems cannot support highly complex path planning algorithms in many scenarios. In this paper, we propose a 3D spatial information compression (3DSIC) based deep reinforcement learning (DRL) algorithm for UAV path planning in cluttered 3D environments. Specifically, the proposed algorithm compresses the 3D spatial information to 2D through 3DSIC, and then combines the compressed 2D environment information with the spatial information of the UAV's current layer to train UAV agents for path planning using neural networks. Additionally, the proposed 3DSIC is a plug-and-use module that can be combined with various DRL frameworks such as Deep Q-Network (DQN) and deep deterministic policy gradient (DDPG). Our simulation results show that the training efficiency of 3DSIC-DQN is 4.028 times higher than that of directly implementing DQN in a $100 \times 100 \times 50$ 3D cluttered environment. Furthermore, the training efficiency of 3DSIC-DDPG is 3.9 times higher than that of the traditional DDPG in the same environment. Moreover, 3DSIC combined with fast recurrent stochastic value gradient (FRSVG), which can be considered the state-of-the-art DRL algorithm for UAV path planning, trains 2.35 times faster than the original FRSVG algorithm.
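The abstract does not say how 3DSIC performs the compression, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes the environment is a boolean occupancy grid of shape $100 \times 100 \times 50$, that the compression collapses the vertical axis into a per-column occupancy fraction (a hypothetical rule), and that the agent's observation stacks this compressed map with the slice at the UAV's current layer. The function names `compress_3d_to_2d` and `build_observation` are invented for illustration.

```python
import numpy as np


def compress_3d_to_2d(grid: np.ndarray) -> np.ndarray:
    """Collapse an (X, Y, Z) boolean occupancy grid along Z.

    Assumed rule (not from the paper): each 2D cell holds the fraction
    of occupied voxels in its vertical column.
    """
    return grid.mean(axis=2, dtype=np.float32)


def build_observation(grid: np.ndarray, uav_layer: int) -> np.ndarray:
    """Stack the compressed 2D map with the UAV's current layer slice."""
    compressed = compress_3d_to_2d(grid)
    current = grid[:, :, uav_layer].astype(np.float32)
    # Channel 0: compressed whole-volume map; channel 1: current layer.
    return np.stack([compressed, current], axis=0)


# Random cluttered environment with roughly 10% occupied voxels
# (the obstacle density is an arbitrary choice for this sketch).
rng = np.random.default_rng(0)
grid = rng.random((100, 100, 50)) < 0.1

obs = build_observation(grid, uav_layer=10)
print(obs.shape)            # (2, 100, 100)
print(grid.size, obs.size)  # 500000 20000
```

Under these assumptions the network input shrinks from 500,000 voxels to 20,000 values, which gives a feel for why a compressed state could reduce training cost; the speedups reported in the abstract come from the authors' simulations, not from this sketch.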
