Representation Enhancement-Based Proximal Policy Optimization for UAV Path Planning and Obstacle Avoidance

Impact factor 1.1 · Zone 4 (Engineering & Technology) · JCR Q3 (ENGINEERING, AEROSPACE)
Xiangxiang Huang, Wei Wang, Zhaokang Ji, Bin Cheng
International Journal of Aerospace Engineering. Journal Article, published 2023-11-08. DOI: 10.1155/2023/6654130 (https://doi.org/10.1155/2023/6654130)
Citation count: 0

Abstract

Path planning and obstacle avoidance are pivotal for intelligent unmanned aerial vehicle (UAV) systems in domains such as post-disaster rescue, target detection, and wildlife conservation. Reinforcement learning (RL) has become increasingly popular for UAV decision-making. However, RL approaches face two challenges when searching for random targets through continuous actions: partial observability and a large state space. This paper proposes a representation enhancement-based proximal policy optimization (RE-PPO) framework to address these issues. The representation enhancement (RE) module consists of observation memory improvement (OMI) and dynamic relative position-attitude reshaping (DRPAR). OMI reduces collisions under partially observable conditions by extracting perception features and state features separately through an embedding network and feeding the extracted features to a gated recurrent unit (GRU) to enhance observation memory. DRPAR compresses the state space when modeling continuous actions by transforming the movement trajectories of different episodes from an absolute coordinate system into different local coordinate systems, so that similarity between trajectories can be exploited. In addition, three step-wise reward functions are formulated to avoid reward sparsity and facilitate model convergence. We evaluate the proposed method in three 3D scenarios to demonstrate its effectiveness. Compared to other methods, our method converges faster during training and achieves a higher success rate and lower timeout and collision rates during inference. Our method can significantly enhance the autonomy and intelligence of UAV systems under partially observable conditions and provides a reasonable solution for UAV decision-making under uncertainty.
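The idea behind DRPAR, expressing positions relative to the UAV rather than in absolute coordinates, can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything here is an assumption for illustration: the function name `to_local_frame`, the yaw-only rotation about the z axis, and the tuple-based position format are all hypothetical and may differ from what the paper uses.

```python
import math

def to_local_frame(uav_pos, uav_yaw, target_pos):
    """Express target_pos in a frame centred on the UAV and aligned with its heading.

    Assumption: only yaw (rotation about z) is considered; the paper's DRPAR
    may use the full attitude or a different convention.
    """
    # Offset from UAV to target in the absolute (world) frame.
    dx = target_pos[0] - uav_pos[0]
    dy = target_pos[1] - uav_pos[1]
    dz = target_pos[2] - uav_pos[2]
    c, s = math.cos(uav_yaw), math.sin(uav_yaw)
    # Rotate the offset by -yaw so the local x axis points along the heading.
    local_x = c * dx + s * dy
    local_y = -s * dx + c * dy
    return (local_x, local_y, dz)

# Two hypothetical episodes: different absolute poses, same relative geometry
# (the target is 5 units straight ahead of the UAV in both cases).
a = to_local_frame((0.0, 0.0, 1.0), 0.0, (5.0, 0.0, 1.0))
b = to_local_frame((10.0, -3.0, 2.0), math.pi / 2, (10.0, 2.0, 2.0))
print(a, b)  # both are approximately (5.0, 0.0, 0.0)
```

In this sketch, two episodes that look unrelated in absolute coordinates map to the same local observation, which is one way a relative representation can shrink the effective state space a policy has to cover.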
Source journal
CiteScore: 2.70
Self-citation rate: 7.10%
Articles published: 195
Review time: 22 weeks
Journal description: International Journal of Aerospace Engineering aims to serve the international aerospace engineering community through dissemination of scientific knowledge on practical engineering and design methodologies pertaining to aircraft and space vehicles. Original unpublished manuscripts are solicited on all areas of aerospace engineering including but not limited to:
- Mechanics of materials and structures
- Aerodynamics and fluid mechanics
- Dynamics and control
- Aeroacoustics
- Aeroelasticity
- Propulsion and combustion
- Avionics and systems
- Flight simulation and mechanics
- Unmanned air vehicles (UAVs)
Review articles on any of the above topics are also welcome.