Meta-enhanced hierarchical multi-agent reinforcement learning for dynamic spectrum management and trust-based routing in cognitive vehicular networks

Impact Factor 4.4; Zone 3 (Computer Science); JCR Q1 (COMPUTER SCIENCE, INFORMATION SYSTEMS)

Ad Hoc Networks Pub Date : 2025-04-24 DOI:10.1016/j.adhoc.2025.103874

Ankur Nahar , Debasis Das , Ramnarayan Yadav , Khujamatov Halimjon , Ernazar Reypnazarov

Ad Hoc Networks, Volume 175, Article 103874. Full text: https://www.sciencedirect.com/science/article/pii/S1570870525001222

Citations: 0

Abstract

This research introduces the Meta-Enhanced Recurrent Multi-Agent Reinforcement Learning (M-RMARL) framework, which addresses reliable routing and dynamic spectrum management in Cognitive Vehicular Ad Hoc Networks (CR-VANETs). The framework builds on Model-Agnostic Meta-Learning (MAML): meta-learned Deep Recurrent Q-Networks (DRQNs) significantly reduce training time, so vehicles can quickly identify optimal routes and improve spectrum sensing with minimal adjustments. M-RMARL also includes a dynamic spectrum management system in which Long Short-Term Memory (LSTM)-based meta-predictive models forecast future spectrum availability and network conditions. These forecasts let the DRQNs act proactively rather than reactively, improving spectrum efficiency. For secure communication, a Trust-Based Meta-Coordination mechanism dynamically evaluates the trustworthiness of each agent and feeds these assessments into the decision-making process. Finally, a Hierarchical Meta-Agent Coordination architecture assigns global coordination and meta-learning updates to Roadside Units (RSUs), while vehicle agents execute the derived policies. This division of roles improves scalability and resource management, making M-RMARL particularly effective in complex decision-making environments. Extensive simulations show improvements of 18% in spectrum utilization, 25% in training convergence, 20% in spectrum prediction accuracy, 30% in training efficiency, and 17% in trust evaluation reliability.
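The split between RSU-side meta-learning updates and vehicle-side adaptation can be illustrated with a toy sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so this assumes a first-order approximation of MAML, replaces the DRQN loss with a hypothetical one-dimensional quadratic loss per task, and uses invented names (`adapt`, `meta_train`, `targets`) and hyperparameters throughout.

```python
import random

# ASSUMPTION: each "vehicle task" is a toy 1-D quadratic loss
# 0.5 * (theta - target)^2 standing in for a DRQN training loss.
def loss(theta, target):
    return 0.5 * (theta - target) ** 2

def grad(theta, target):
    return theta - target

def adapt(theta, target, inner_lr=0.5, steps=1):
    """Vehicle side (inner loop): a few gradient steps from the shared start."""
    for _ in range(steps):
        theta = theta - inner_lr * grad(theta, target)
    return theta

def meta_train(targets, meta_lr=0.1, inner_lr=0.5, rounds=200, batch=4, seed=0):
    """RSU side (outer loop), first-order MAML: update the shared
    initialization with the average gradient measured after adaptation."""
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(rounds):
        meta_grad = 0.0
        for target in rng.sample(targets, batch):
            adapted = adapt(theta, target, inner_lr)
            # First-order approximation: ignore d(adapted)/d(theta).
            meta_grad += grad(adapted, target)
        theta -= meta_lr * meta_grad / batch
    return theta

# Hypothetical training tasks and one unseen task.
targets = [3.0, 4.0, 5.0, 6.0, 7.0]
new_task = 5.5

theta_meta = meta_train(targets)
loss_meta = loss(adapt(theta_meta, new_task), new_task)   # meta-learned start
loss_scratch = loss(adapt(0.0, new_task), new_task)       # untrained start

print(f"meta-learned initialization: {theta_meta:.3f}")
print(f"loss after one step (meta start):    {loss_meta:.4f}")
print(f"loss after one step (scratch start): {loss_scratch:.4f}")
```

In this toy setting the meta-learned initialization settles near the center of the training tasks, so a single adaptation step on the unseen task leaves a much smaller loss than the same step taken from an untrained start. That is the "minimal adjustments" effect the abstract attributes to meta-learning, shown here only in miniature.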


Source journal

Ad Hoc Networks (Engineering & Technology: Telecommunications)

CiteScore: 10.20
Self-citation rate: 4.20%
Articles published: 131
Review time: 4.8 months

About the journal: Ad Hoc Networks is an international, archival journal covering all topics of interest to those working in ad hoc and sensor networking. It considers original, high-quality, unpublished contributions addressing all aspects of ad hoc and sensor networks. Specific areas of interest include, but are not limited to: mobile and wireless ad hoc networks; sensor networks; wireless local and personal area networks; home networks; ad hoc networks of autonomous intelligent systems; novel architectures for ad hoc and sensor networks; self-organizing network architectures and protocols; transport layer protocols; routing protocols (unicast, multicast, geocast, etc.); media access control techniques; error control schemes; power-aware, low-power and energy-efficient designs; synchronization and scheduling issues; mobility management; mobility-tolerant communication protocols; location tracking and location-based services; resource and information management; security and fault-tolerance issues; hardware and software platforms, systems, and testbeds; experimental and prototype results; quality-of-service issues; cross-layer interactions; scalability issues; performance analysis and simulation of protocols.