{"title":"An optimal resource assignment and mode selection for vehicular communication using proximal on-policy scheme","authors":"","doi":"10.1016/j.aej.2024.07.010","DOIUrl":null,"url":null,"abstract":"<div><p>Vehicle-to-everything (V2X) communication is essential in 5G and upcoming networks as it enables seamless interaction between vehicles and infrastructure, ensuring the reliable transmission of critical and time-sensitive data. Challenges like unstable communication in highly mobile vehicular networks, limited channel state information, high transmission overhead, and significant communication costs hinder vehicle-to-vehicle (V2V) communication. To tackle these issues, a unified approach utilizing distributed deep reinforcement learning is proposed to enhance the overall network performance while meeting the quality of service (QoS), latency, and rate requirements. Recognizing the complexity of this NP-hard, non-convex problem, a machine learning framework based on the Markov decision process (MDP) is adopted for a robust strategy. This framework facilitates the formulation of a reward function and the selection of optimal actions with certainty. Furthermore, a spectrum-based allocation framework employing multi-agent deep reinforcement learning (MADRL) is confidently introduced. The deep deterministic policy gradient (DDPG) within this framework enables the exchange of historical data globally during the primary learning phase, effectively removing the need for signal interaction and manual intervention in optimizing system efficiency. The data transmission policy follows an augmented online policy scheme, known as the proximal online policy scheme (POPS), which confidently reduces the computational complexity during the learning process. The complexity is marginally adjusted using the clipping substitute technique with assurance in the learning phase. Simulation results validate that the proposed method outperforms existing decentralized systems in achieving a higher average data transmission rate and ensuring quality of service (QoS) satisfaction confidently.</p></div>","PeriodicalId":7484,"journal":{"name":"alexandria engineering journal","volume":null,"pages":null},"PeriodicalIF":6.2000,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1110016824007312/pdfft?md5=df782746c6569cf0cb29189a60affe54&pid=1-s2.0-S1110016824007312-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"alexandria engineering journal","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1110016824007312","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, MULTIDISCIPLINARY","Score":null,"Total":0}
Citations: 0
Abstract
Vehicle-to-everything (V2X) communication is essential in 5G and upcoming networks: it enables seamless interaction between vehicles and infrastructure and ensures the reliable transmission of critical, time-sensitive data. Vehicle-to-vehicle (V2V) communication, however, is hindered by unstable links in highly mobile vehicular networks, limited channel state information, high transmission overhead, and significant communication costs. To tackle these issues, a unified approach based on distributed deep reinforcement learning is proposed to enhance overall network performance while meeting quality of service (QoS), latency, and rate requirements. Because the underlying optimization problem is NP-hard and non-convex, it is cast as a Markov decision process (MDP), which provides a principled way to formulate a reward function and select optimal actions. On top of this, a spectrum allocation framework employing multi-agent deep reinforcement learning (MADRL) is introduced. Within this framework, the deep deterministic policy gradient (DDPG) allows agents to exchange historical data globally during the primary learning phase, removing the need for signaling interaction and manual intervention when optimizing system efficiency. The data transmission policy follows an augmented on-policy scheme, the proximal on-policy scheme (POPS), which reduces computational complexity during the learning process; the complexity is further controlled by a clipped surrogate technique in the learning phase. Simulation results show that the proposed method outperforms existing decentralized systems, achieving a higher average data transmission rate while satisfying QoS requirements.
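The abstract does not spell out the POPS update rule. As a point of reference, the clipped surrogate technique it mentions is standard in proximal policy methods such as PPO; the sketch below is a minimal, hypothetical PyTorch implementation of such a clipped objective, not the paper's exact formulation. The function name `clipped_surrogate_loss` and the clip parameter `epsilon` are illustrative assumptions.

```python
import torch

def clipped_surrogate_loss(new_log_probs: torch.Tensor,
                           old_log_probs: torch.Tensor,
                           advantages: torch.Tensor,
                           epsilon: float = 0.2) -> torch.Tensor:
    """Negated PPO-style clipped surrogate objective, suitable for gradient descent."""
    # Probability ratio r_t(theta) = pi_theta(a_t|s_t) / pi_theta_old(a_t|s_t),
    # computed from log-probabilities for numerical stability.
    ratio = torch.exp(new_log_probs - old_log_probs)
    # Unclipped surrogate and its clipped counterpart.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - epsilon, 1.0 + epsilon) * advantages
    # Element-wise minimum gives a pessimistic bound; negate so that
    # minimizing the loss maximizes the surrogate objective.
    return -torch.min(unclipped, clipped).mean()

# Example with four sampled transitions (values are illustrative).
new_lp = torch.tensor([-0.9, -1.2, -0.4, -2.0], requires_grad=True)
old_lp = torch.tensor([-1.0, -1.0, -0.5, -1.8])
adv = torch.tensor([0.5, -0.3, 1.2, 0.1])
clipped_surrogate_loss(new_lp, old_lp, adv).backward()
```

Clipping the ratio to [1 − ε, 1 + ε] bounds each policy update, which is what keeps per-step computation cheap and policy drift small during learning.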
About the journal:
Alexandria Engineering Journal is an international journal devoted to publishing high-quality papers in the fields of engineering and applied science. It is cited in the Engineering Information Services (EIS) and the Chemical Abstracts (CA). The papers published in Alexandria Engineering Journal are grouped into five sections, according to the following classification:
• Mechanical, Production, Marine and Textile Engineering
• Electrical Engineering, Computer Science and Nuclear Engineering
• Civil and Architecture Engineering
• Chemical Engineering and Applied Sciences
• Environmental Engineering