A Novel On-Policy DRL-Based Approach for Resource Allocation in Hybrid RF/VLC Systems

IF 4.3 · CAS Zone 2 (Computer Science) · JCR Q1 (Engineering, Electrical & Electronic)
Tanya Verma;Arif Raza;Shivanshu Shrivastava;Amit Kumar;Dwarkadas Prahladadas Kothari;Umakant Dhar Dwivedi
DOI: 10.1109/TCE.2025.3529846
Journal: IEEE Transactions on Consumer Electronics, vol. 71, no. 1, pp. 550-560
Published: 2025-01-17
Citations: 0

Abstract

Visible light communication (VLC) has emerged as a promising technology, delivering high-speed data transmission for 5G and beyond communications. Nevertheless, its susceptibility to blockages demands co-deployment with traditional radio frequency (RF) systems to ensure uninterrupted connectivity. This co-deployment, known as a hybrid RF/VLC system, is a subset of heterogeneous networks (HetNets) and offers interoperability, energy efficiency, and optimal resource utilization. In hybrid RF/VLC, efficient resource allocation and load balancing are crucial. Existing Deep Q-Network (DQN) learning-based methods designed to address these issues fail in large and dynamic environments. The present study investigates alternative approaches for optimal resource allocation and load balancing in dynamic and large hybrid RF/VLC systems, to achieve maximum data rates for users. We propose two model-free on-policy deep reinforcement learning (DRL) based schemes, namely advantage actor-critic (A2C) and proximal policy optimization (PPO), for efficient resource allocation in hybrid RF/VLC. Simulation results show that the A2C and PPO based schemes outperform the DQN learning scheme by 31.3% and 32.5%, respectively, in terms of data rates. The proposed schemes also outperform the deep deterministic policy gradient (DDPG) in data rate maximization by up to 8.1% and 9.7%, respectively.
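To illustrate the on-policy actor-critic idea behind the proposed schemes, below is a minimal, self-contained A2C-style sketch on a toy link-selection problem: an agent repeatedly chooses between an RF link (steady moderate rate) and a VLC link (higher rate, but occasionally blocked). This is not the paper's system or environment; the rates, blockage probability, learning rates, and the tabular softmax policy are all illustrative assumptions standing in for the deep networks and full hybrid RF/VLC model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy hybrid RF/VLC link-selection environment (illustrative only):
# action 0 = RF link: steady moderate data rate.
# action 1 = VLC link: high rate, but blocked with probability P_BLOCK,
#            in which case the achieved rate is near zero.
P_BLOCK = 0.3
RATE_RF, RATE_VLC, RATE_BLOCKED = 1.0, 2.0, 0.1

def step(action):
    """Return the data rate (reward) obtained by the chosen link."""
    if action == 0:
        return RATE_RF
    return RATE_BLOCKED if rng.random() < P_BLOCK else RATE_VLC

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

# A2C-style learner: actor = softmax policy over logits,
# critic = scalar value baseline estimating the expected reward.
logits = np.zeros(2)
value = 0.0
alpha_pi, alpha_v = 0.1, 0.1   # actor / critic learning rates

for _ in range(2000):
    probs = softmax(logits)
    a = rng.choice(2, p=probs)          # sample action on-policy
    r = step(a)
    advantage = r - value               # advantage w.r.t. the baseline
    value += alpha_v * advantage        # critic update
    grad = -probs                       # d log pi(a) / d logits ...
    grad[a] += 1.0                      # ... for a softmax policy
    logits += alpha_pi * advantage * grad  # actor (policy-gradient) update

probs = softmax(logits)
```

Since the VLC link's expected rate (0.7 × 2.0 + 0.3 × 0.1 = 1.43) exceeds the RF link's 1.0, the learned policy should come to favor VLC; swapping the advantage-weighted update for a clipped surrogate objective would give the corresponding PPO-style variant.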
Source journal metrics: CiteScore 7.70 · Self-citation rate 9.30% · Articles per year 59 · Review time 3.3 months
Journal description: The main focus of the IEEE Transactions on Consumer Electronics is the engineering and research aspects of the theory, design, construction, manufacture, or end use of mass-market electronics, systems, software, and services for consumers.