UAV path planning based on the improved PPO algorithm

Chenyang Qi, Chengfu Wu, Lei Lei, Xiaolu Li, Peiyan Cong
Published in: 2022 Asia Conference on Advanced Robotics, Automation, and Control Engineering (ARACE)
Publication date: 2022-08-01
DOI: https://doi.org/10.1109/ARACE56528.2022.00040
Citations: 3

Abstract

This paper considers the problem of unmanned aerial vehicle (UAV) path planning. Traditional path planning algorithms suffer from low efficiency and poor adaptability, so this paper uses reinforcement learning to perform the path planning. In the classic proximal policy optimization (PPO) algorithm, samples with large rewards in the experience replay buffer can seriously affect training. This degrades the agent's exploration performance and leads to poor convergence in some path planning tasks. To solve these problems, this paper proposes a frequency decomposition PPO algorithm (FD-PPO) and designs a heuristic reward function for the UAV path planning problem. The FD-PPO algorithm decomposes rewards into multi-dimensional frequency rewards and then calculates the frequency return to efficiently guide the UAV through the path planning task. Simulation results show that the proposed FD-PPO algorithm can adapt to complex environments and has outstanding stability in continuous state and action spaces. The FD-PPO algorithm also performs better in path planning than the PPO algorithm.
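
For context, the baseline that FD-PPO modifies can be sketched as follows. This is the standard PPO clipped surrogate objective, not the FD-PPO variant: the abstract does not specify how the frequency decomposition or the frequency return is computed, so neither is reproduced here. The function names, the clipping range `eps=0.2`, and the sample values are assumptions chosen for illustration.

```python
def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Standard PPO clipped surrogate objective, averaged over samples.

    ratios     -- probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages -- advantage estimates, one per sample
    eps        -- clipping range (0.2 is a common default, assumed here)
    """
    terms = []
    for r, a in zip(ratios, advantages):
        unclipped = r * a
        clipped = clip(r, 1.0 - eps, 1.0 + eps) * a
        # Taking the minimum gives a pessimistic bound on the improvement.
        terms.append(min(unclipped, clipped))
    return sum(terms) / len(terms)


# Hypothetical batch: one ratio above and one below the clipping range.
value = ppo_clip_objective([1.5, 0.5], [1.0, -1.0])
print(value)
```

Because the objective is a mean over per-sample terms that scale with the advantage, a single sample with a very large advantage can dominate the batch, which is consistent with the sensitivity to large-reward samples that the abstract describes.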