Dynamic Resource Allocation for Real-Time Cloud XR Video Transmission: A Reinforcement Learning Approach

IF 7 | Zone 1 Computer Science | Q1 TELECOMMUNICATIONS

IEEE Transactions on Cognitive Communications and Networking | Pub Date: 2024-01-11 | DOI: 10.1109/TCCN.2024.3352982

Zhaocheng Wang; Rui Wang; Jun Wu; Wei Zhang; Chenxi Li

IEEE Transactions on Cognitive Communications and Networking, vol. 10, no. 3, pp. 996-1010. Full text: https://ieeexplore.ieee.org/document/10391056/

Citations: 0

Abstract

Extended reality (XR) applications are growing rapidly alongside the development of the mobile Internet. Their requirements for high reliability and ultra-low latency pose a significant challenge for wireless resource allocation, so a sound resource allocation scheme is crucial. However, the complex characteristics of multi-user channels, coupled with the huge solution space of the resource allocation optimization problem, prevent conventional methods from efficiently and reliably deriving resource block (RB) allocation schemes. Therefore, in this paper, we construct a low-latency, highly dynamic cloud XR video transmission model that accounts for the random misalignment of video arrivals across users, and we turn to recently developed deep reinforcement learning (DRL) techniques for solutions. To cope with the curse of dimensionality caused by the exponential growth of the RB allocation space, we propose a parallel multi-DRL framework as the foundation for two dynamic RB allocation algorithms: multi noisy double dueling deep Q networks (M-Noisy-D3QN) and multi soft actor critic (M-SAC). Both proposed algorithms improve resource utilization and balance exploration ability against complexity. Moreover, because RB allocation actions and system goals are not directly related, we design a novel reward function that combines external rewards and internal incentives to establish a coherent connection between the two, i.e., to solve the reward sparsity problem in DRL. Simulation results show that, under bandwidth constraints, the proposed dynamic RB allocation methods can successfully serve nearly twice as many users as the benchmark methods.
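The scale of the problem the abstract describes can be illustrated with a small sketch. It assumes, purely for illustration, that each RB is assigned to exactly one user and that the parallel framework uses one agent per RB; the abstract does not state how the paper actually splits the decision, and the sizes, function names, and the weighted-sum reward below are all hypothetical.

```python
# Hypothetical sizes, chosen only for illustration (not taken from the paper).
NUM_USERS = 8
NUM_RBS = 20


def joint_action_count(num_users: int, num_rbs: int) -> int:
    """Actions for a single agent that assigns every RB at once.

    Assumes each RB goes to exactly one user, so the count is
    num_users ** num_rbs and grows exponentially with the number of RBs.
    """
    return num_users ** num_rbs


def parallel_action_count(num_users: int, num_rbs: int) -> int:
    """Total actions when each RB has its own agent choosing one user.

    This per-RB split is an assumption; the total grows only linearly.
    """
    return num_users * num_rbs


def shaped_reward(external: float, internal: float, weight: float = 0.5) -> float:
    """Assumed weighted sum of a sparse external reward and a dense internal incentive."""
    return external + weight * internal


if __name__ == "__main__":
    print("joint actions:   ", joint_action_count(NUM_USERS, NUM_RBS))
    print("parallel actions:", parallel_action_count(NUM_USERS, NUM_RBS))
    # Even when the external (system-level) reward is zero, the internal
    # incentive still gives the agent a learning signal.
    print("shaped reward:   ", shaped_reward(external=0.0, internal=0.4))
```

Under these assumptions the joint formulation has 8^20 = 2^60 actions, while the per-RB split has 160 in total, which is why a single flat Q-network over the joint action space is impractical.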

Source journal

IEEE Transactions on Cognitive Communications and Networking Computer Science-Artificial Intelligence

CiteScore

15.50

Self-citation rate

7.00%

Articles published

108

About the journal: The IEEE Transactions on Cognitive Communications and Networking (TCCN) aims to publish high-quality manuscripts that push the boundaries of cognitive communications and networking research. Cognitive, in this context, refers to the application of perception, learning, reasoning, memory, and adaptive approaches in communication system design. The transactions welcome submissions that explore various aspects of cognitive communications and networks, focusing on innovative and holistic approaches to complex system design. Key topics covered include architecture, protocols, cross-layer design, and cognition cycle design for cognitive networks. Additionally, research on machine learning, artificial intelligence, end-to-end and distributed intelligence, software-defined networking, cognitive radios, spectrum sharing, and security and privacy issues in cognitive networks is of interest. The publication also encourages papers addressing novel services and applications enabled by these cognitive concepts.