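Channel-hopping sequence generation for blind rendezvous in cognitive radio-enabled internet of vehicles: A multi-agent twin delayed deep deterministic policy gradient-based method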

IF 4.3, CAS Zone 3 (Computer Science), JCR Q1 (COMPUTER SCIENCE, INFORMATION SYSTEMS)

Computer Communications Pub Date : 2025-09-19 DOI:10.1016/j.comcom.2025.108318

Mehri Asadi Vasfi, Behrouz Shahgholi Ghahfarokhi

Computer Communications, Volume 243, Article 108318. Source: https://www.sciencedirect.com/science/article/pii/S0140366425002750

Citations: 0

Abstract

Efficient spectrum utilization is a major challenge in highly dynamic vehicular environments because spectrum resources are scarce. Cognitive Radio (CR) has emerged as a way to improve spectrum utilization by enabling opportunistic access in the Internet of Vehicles (IoV). In this context, channel-hopping-based blind rendezvous offers a practical approach to decentralized spectrum access in CR-enabled IoV (CR-IoV). This paper presents a novel strategy, based on Multi-Agent Twin Delayed Deep Deterministic Policy Gradient (MATD3PG), for generating channel sequences in channel-hopping-based blind rendezvous. Unlike existing methods, which overlook the quality of licensed spectrum, our approach ensures spectrum efficiency and QoS awareness in dynamic channel sequence generation. We formulate channel sequence selection as a multi-objective optimization problem that aims to maximize spectrum efficiency and minimize Time-To-Rendezvous (TTR) while meeting the stringent latency and reliability requirements of vehicular communications. Each vehicle independently generates a channel-hopping sequence using a learning agent that considers key channel quality metrics such as availability, reliability, and capacity. The generated sequences are used in an asynchronous and asymmetric blind rendezvous process, which enhances adaptability to dynamic network conditions. Simulation results demonstrate that the proposed method significantly outperforms existing approaches in terms of Expected TTR (ETTR), Maximum TTR (MTTR), delay, capacity, and reliability. The approaches compared are Enhanced Jump-Stay (EJS); Single-radio Sunflower Set (SSS); Zero-type, One-type, and S-type (ZOS); Multi-Agent Q-Learning based Rendezvous (MAQLR); Exponential-weight algorithm for Exploration and Exploitation (Exp3); and Reinforcement Learning-based Channel-Hopping Rendezvous (RLCH).
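To make the TTR, ETTR, and MTTR metrics named in the abstract concrete, the following is a minimal sketch of how they could be measured for two fixed, periodic channel-hopping sequences. It is not the paper's method: the function names, the example sequences, the 0-based slot convention, and the choice to average uniformly over clock offsets are all assumptions made for illustration. The two sequences differ (asymmetric) and one radio's clock is shifted by an unknown offset (asynchronous).

```python
from math import lcm  # available in Python 3.9+
from typing import Optional, Sequence, Tuple


def time_to_rendezvous(
    seq_a: Sequence[int], seq_b: Sequence[int], offset_b: int
) -> Optional[int]:
    """Return the first 0-based slot in which both radios hop to the same channel.

    Radio A plays seq_a from its start; radio B is already offset_b slots
    into seq_b (hypothetical model of an asynchronous start).
    Returns None if the two sequences never meet for this offset.
    """
    # The joint hopping pattern repeats every lcm(len_a, len_b) slots,
    # so searching one such period is sufficient.
    period = lcm(len(seq_a), len(seq_b))
    for t in range(period):
        if seq_a[t % len(seq_a)] == seq_b[(t + offset_b) % len(seq_b)]:
            return t
    return None


def ttr_stats(
    seq_a: Sequence[int], seq_b: Sequence[int]
) -> Optional[Tuple[float, int]]:
    """Return (expected TTR, maximum TTR) over all clock offsets of radio B.

    Offsets are assumed equally likely (an illustrative assumption).
    Returns None if rendezvous fails for at least one offset.
    """
    ttrs = [time_to_rendezvous(seq_a, seq_b, off) for off in range(len(seq_b))]
    if any(t is None for t in ttrs):
        return None
    return sum(ttrs) / len(ttrs), max(ttrs)


# Hypothetical sequences over channels {0, 1, 2}; not taken from the paper.
vehicle_a = [0, 1, 2]
vehicle_b = [2, 2, 1, 0]
stats = ttr_stats(vehicle_a, vehicle_b)
```

In this sketch a lower expected and maximum TTR indicates a better pair of sequences. A quality-aware generator of the kind the abstract describes would additionally prefer channels with higher availability, reliability, and capacity, which this example does not model.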

Source journal

Computer Communications (Engineering & Technology: Telecommunications)

CiteScore: 14.10
Self-citation rate: 5.00%
Articles published: 397
Review time: 66 days

About the journal: Computer and communication networks are key infrastructures of the information society with high socio-economic value, as they contribute to the correct operation of many critical services (from healthcare to finance and transportation). The Internet is the core of today's computer-communication infrastructures. This has transformed the Internet from a robust network for data transfer between computers into a global, content-rich communication and information system where content is increasingly generated by users and distributed according to human social relations. Next-generation network technologies, architectures, and protocols are therefore required to overcome the limitations of the legacy Internet and add new capabilities and services. The future Internet should be ubiquitous, secure, resilient, and closer to human communication paradigms. Computer Communications is a peer-reviewed international journal that publishes high-quality scientific articles (both theory and practice) and survey papers covering all aspects of future computer communication networks (on all layers, except the physical layer), with special attention to the evolution of the Internet architecture, protocols, services, and applications.