HQA: Hybrid Q-learning and AODV multi-path routing algorithm for Flying Ad-hoc Networks
Chen Sun, Liang Hou, Suqi Yu, Jian Shu
Vehicular Communications, Volume 55, Article 100947
Published: 2025-06-23
DOI: 10.1016/j.vehcom.2025.100947
Citations: 0
Abstract
Reliable and efficient data transmission between Unmanned Aerial Vehicle (UAV) nodes is critical for the control of UAV swarms and relies heavily on effective routing protocols in Flying Ad-hoc Networks (FANETs). However, Q-learning-based FANET routing protocols, which are gaining widespread attention, face two significant challenges: 1) the insufficient stability of Q-learning leads to unreliable route selection in certain scenarios and higher packet loss rates; 2) in void regions with frequent topology changes and vast path exploration spaces, the slow convergence of Q-learning cannot adapt quickly to dynamic environmental changes, thereby reducing the packet delivery rate (PDR). This paper proposes a hybrid Q-learning/AODV (HQA) multi-path routing algorithm that integrates the Q-learning and AODV protocols to address these challenges. HQA includes a Bayesian stability evaluator for adaptive Q-learning/AODV switching and a dual-update reward mechanism that integrates reliable AODV paths into Q-learning training, enabling rapid void recovery and latency-optimized routing. Experimental results demonstrate HQA's superiority over baseline protocols: compared to AODV, HQA reduces average end-to-end delay by 13.6–23.9% and improves PDR by 5.4–9.1% in non-void and void states, respectively. It outperforms QMR by 2.2–6.3% in PDR, while achieving 25.6% and 53.2% higher average PDR than QMR and AODV, respectively, across network densities. The hybrid design accelerates convergence by 40% versus standalone Q-learning through AODV-assisted rewards, maintaining scalability under dynamic topology changes. These findings indicate that the HQA algorithm can adapt more rapidly to the fast-changing conditions of FANETs and better handle void regions, offering a promising solution for enhancing the performance and reliability of FANETs.
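The paper itself does not publish code, but the two mechanisms named in the abstract can be illustrated with a minimal, hypothetical sketch: a Bayesian (Beta-posterior) per-neighbor stability estimate that decides when to trust the Q-learned next hop versus falling back to AODV route discovery, and a "dual-update" Q-learning step in which links lying on a discovered AODV path receive an extra reward. All parameter values (learning rate, discount, switch threshold, AODV bonus) are illustrative assumptions, not figures from the paper.

```python
class HQASketch:
    """Hypothetical sketch of HQA's two core ideas (not the authors' code):
    1) a Bayesian stability evaluator: a Beta posterior over each neighbor's
       delivery success probability drives Q-learning/AODV switching;
    2) a dual-update reward: the standard Q-learning update plus a bonus
       reward for links confirmed to lie on a reliable AODV path."""

    def __init__(self, alpha=0.9, gamma=0.8, switch_threshold=0.6):
        self.alpha = alpha                # learning rate (assumed value)
        self.gamma = gamma                # discount factor (assumed value)
        self.threshold = switch_threshold  # stability cutoff (assumed value)
        self.q = {}                       # (node, neighbor) -> Q-value
        self.stats = {}                   # neighbor -> [successes, failures]

    def record_delivery(self, neighbor, success):
        # Update the Beta posterior: Beta(1, 1) uniform prior, then count
        # delivery successes and failures on this link.
        s, f = self.stats.get(neighbor, [1, 1])
        self.stats[neighbor] = [s + success, f + (1 - success)]

    def stability(self, neighbor):
        # Posterior mean of Beta(s, f): estimated P(link is stable).
        s, f = self.stats.get(neighbor, [1, 1])
        return s / (s + f)

    def use_q_learning(self, neighbor):
        # Route via Q-learning while the next hop looks stable; fall back
        # to AODV route discovery when stability drops below the threshold.
        return self.stability(neighbor) >= self.threshold

    def q_update(self, node, neighbor, reward, next_best_q, on_aodv_path=False):
        # Standard Q-learning update, with an extra reward term (the
        # "dual update") when the link lies on a discovered AODV path.
        key = (node, neighbor)
        old = self.q.get(key, 0.0)
        if on_aodv_path:
            reward += 0.5  # AODV-path bonus (assumed magnitude)
        self.q[key] = old + self.alpha * (reward + self.gamma * next_best_q - old)
        return self.q[key]
```

Feeding AODV-confirmed paths back into the Q-table this way is one plausible reading of how the hybrid design could speed up convergence: the agent does not have to discover reliable routes purely by exploration.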
Journal Description
Vehicular communications is a growing area covering communications between vehicles and with roadside communication infrastructure. Advances in wireless communications are making it possible to share information through real-time communications between vehicles and infrastructure. This has led to applications that increase vehicle safety and connect passengers to the Internet. Standardization efforts on vehicular communication are also underway to make vehicular transportation safer, greener and easier.
The aim of the journal is to publish high-quality peer-reviewed papers in the area of vehicular communications. The scope encompasses all types of communications involving vehicles, including vehicle-to-vehicle and vehicle-to-infrastructure. The scope includes (but is not limited to) the following topics related to vehicular communications:
Vehicle-to-vehicle and vehicle-to-infrastructure communications
Channel modelling, modulation and coding
Congestion control and scalability issues
Protocol design, testing and verification
Routing in vehicular networks
Security issues and countermeasures
Deployment and field testing
Reducing energy consumption and enhancing safety of vehicles
Wireless in-car networks
Data collection and dissemination methods
Mobility and handover issues
Safety and driver assistance applications
UAV
Underwater communications
Autonomous cooperative driving
Social networks
Internet of vehicles
Standardization of protocols.