An improved elitist-Q-Learning path planning strategy for VTOL air-ground vehicle using convolutional neural network mode prediction

IF 8 · Zone 1, Engineering & Technology · JCR Q1 (Computer Science, Artificial Intelligence)

Advanced Engineering Informatics · Pub Date: 2025-04-05 · DOI: 10.1016/j.aei.2025.103316

Jing Zhao, Chao Yang, Weida Wang, Ying Li, Tianqi Qie, Bin Xu

Volume 65 · Journal Article · https://www.sciencedirect.com/science/article/pii/S1474034625002095

Citations: 0

Abstract

Vertical take-off and landing (VTOL) air-ground integrated vehicles have received extensive attention in rescue, transportation, and other task fields. To further improve task efficiency in complex environments such as post-disaster cities and scrubland, such a vehicle requires efficient and rational path planning. In these environments, it is difficult to obtain complete and accurate obstacle information. The planning process must therefore use limited obstacle perception information to switch between air and ground modes and to quickly obtain the optimal, shortest-distance trajectory. To address these issues, this paper proposes an improved elitist-Q-Learning path planning strategy for the VTOL air-ground vehicle using convolutional neural network mode prediction. Firstly, to predict the mode switching actions, a convolutional neural mode prediction network is constructed with local obstacle information as input data. Secondly, based on the predicted actions, an elitist-Q-Learning (EQL) multi-mode planning algorithm is designed. A new reward function that accounts for the multi-mode actions is proposed. On this basis, heuristic correction and elitist adjusting factors replace the fixed rewards of traditional Q-Learning with rewards that are adjusted dynamically during the iterative process, so that the Q table is updated quickly and converges to optimal values. Finally, the proposed strategy is verified on randomly generated maps of 1000 m × 1000 m. Results show that the prediction accuracy remains above 93%. The path distance is 4.56% and 1.75% shorter than that of traditional Q-Learning and A* with mode prediction, respectively, and equal to that of BAS-A*, LPA*, and D* Lite with mode prediction. Compared to traditional Q-Learning, the strategy reduces computational time by 36.61% and needs 58.9% fewer iterations to converge.
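The abstract gives no formulas, so the following is only a minimal sketch of the general idea of replacing a fixed step reward with a dynamically adjusted one in tabular Q-Learning. It is not the paper's EQL algorithm. Everything in it is an assumption made for illustration: the 2D grid with four ground moves, the reward constants, the Manhattan-distance "heuristic correction", and the "elite replay" of the shortest successful episode so far. Air-ground mode switching and the CNN mode predictor are omitted entirely.

```python
import random

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def step(grid, s, a, goal):
    """Apply action a in state s; return (next_state, reward)."""
    rows, cols = len(grid), len(grid[0])
    nr, nc = s[0] + MOVES[a][0], s[1] + MOVES[a][1]
    if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
        return s, -5.0  # blocked: stay in place (illustrative penalty)
    n = (nr, nc)
    if n == goal:
        return n, 100.0  # illustrative goal reward
    # Fixed step cost plus an assumed heuristic correction:
    # moving closer to the goal is penalised less than moving away.
    return n, -1.0 + 0.5 * (manhattan(s, goal) - manhattan(n, goal))


def train(grid, start, goal, episodes=500, alpha=0.5, gamma=0.95, eps=0.2, seed=0):
    """Tabular Q-learning with a shaped reward and a simple elite replay."""
    rng = random.Random(seed)
    rows, cols = len(grid), len(grid[0])
    q = {(r, c): [0.0] * 4 for r in range(rows) for c in range(cols)}

    def update(s, a, r, n):
        target = r if n == goal else r + gamma * max(q[n])
        q[s][a] += alpha * (target - q[s][a])

    elite = None  # shortest successful episode so far, as (s, a, r, n) tuples
    for _episode in range(episodes):
        s, traj = start, []
        for _t in range(4 * rows * cols):  # cap on episode length
            if rng.random() < eps:
                a = rng.randrange(4)  # explore
            else:
                a = max(range(4), key=lambda i: q[s][i])  # exploit
            n, r = step(grid, s, a, goal)
            update(s, a, r, n)
            traj.append((s, a, r, n))
            s = n
            if s == goal:
                break
        if s == goal and (elite is None or len(traj) < len(elite)):
            elite = traj
        if elite is not None:
            # Assumed "elitist" step: replay the best episode backward so the
            # goal reward propagates along it faster.
            for transition in reversed(elite):
                update(*transition)
    return q


def greedy_path(q, grid, start, goal):
    """Follow the greedy policy from start; stop at the goal or when stuck."""
    path, s = [start], start
    while s != goal and len(path) <= len(grid) * len(grid[0]):
        a = max(range(4), key=lambda i: q[s][i])
        n, _ = step(grid, s, a, goal)
        if n == s:  # greedy action is blocked
            break
        s = n
        path.append(s)
    return path


GRID = [  # 0 = free cell, 1 = obstacle (hypothetical map)
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
]
q_table = train(GRID, (0, 0), (4, 4))
print(greedy_path(q_table, GRID, (0, 0), (4, 4)))
```

The shaped reward depends on the current and next state, which is one simple way to make rewards vary during learning. The paper's actual correction and elitist factors may be defined quite differently.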

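Source journal: Advanced Engineering Informatics (Engineering & Technology: Engineering, Multidisciplinary)

CiteScore: 12.40

Self-citation rate: 18.20%

Articles published: 292

Review time: 45 days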

About the journal: Advanced Engineering Informatics is an international journal that solicits research papers with an emphasis on 'knowledge' and 'engineering applications'. The journal seeks original papers that report progress in applying methods of engineering informatics. These papers should have engineering relevance and help provide a scientific base for more reliable, spontaneous, and creative engineering decision-making. Additionally, papers should demonstrate the science of supporting knowledge-intensive engineering tasks and validate the generality, power, and scalability of new methods through rigorous evaluation, preferably both qualitatively and quantitatively. Abstracting and indexing for Advanced Engineering Informatics include Science Citation Index Expanded, Scopus, and INSPEC.