Task Offloading and Resource Allocation Based on Reinforcement Learning and Load Balancing in Vehicular Networking

IF 4.3 · CAS Region 2 (Computer Science) · JCR Q1 (Engineering, Electrical & Electronic)
Shujuan Tian, Shuhuan Xiang, Ziqi Zhou, Haipeng Dai, Enze Yu, Qingyong Deng
{"title":"Task Offloading and Resource Allocation Based on Reinforcement Learning and Load Balancing in Vehicular Networking","authors":"Shujuan Tian;Shuhuan Xiang;Ziqi Zhou;Haipeng Dai;Enze Yu;Qingyong Deng","doi":"10.1109/TCE.2025.3542133","DOIUrl":null,"url":null,"abstract":"Due to limited on-board resources and the mobility characteristics of vehicles in a multi-access edge computing (MEC)-based vehicular network, efficient task offloading and resource allocation schemes are essential for achieving low-latency and low-energy consumption applications in the Internet of Vehicles (IoV). The spatial distribution of vehicles, influenced by various factors, leads to significant workload variations across MEC servers. In this paper, we address task offloading and resource allocation as a joint optimization problem and propose a Load-Balancing Deep Deterministic Policy Gradient (LBDDPG) algorithm to achieve optimal results. The joint optimization problem is modeled as a Markov Decision Process (MDP), enabling the LBDDPG algorithm to systematically address the challenges of workload imbalance and resource inefficiency. The algorithm incorporates a load optimization strategy to balance workload distribution across MEC servers, mitigating disparities caused by uneven vehicle distributions. The reward function is designed to account for both energy consumption and delay, ensuring an optimal trade-off between these critical factors. To enhance training efficiency, a noise-based exploration strategy is employed, preventing ineffective exploration during the early stages. Additionally, constraints such as computational capacity and latency thresholds are embedded to ensure the algorithm’s practical applicability. Experimental results demonstrate that the proposed LBDDPG algorithm achieves faster convergence and superior performance in terms of energy consumption and latency compared to other reinforcement learning algorithms.","PeriodicalId":13208,"journal":{"name":"IEEE Transactions on Consumer Electronics","volume":"71 1","pages":"2217-2230"},"PeriodicalIF":4.3000,"publicationDate":"2025-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Consumer Electronics","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10887320/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

Due to limited on-board resources and the mobility characteristics of vehicles in a multi-access edge computing (MEC)-based vehicular network, efficient task offloading and resource allocation schemes are essential for achieving low-latency and low-energy consumption applications in the Internet of Vehicles (IoV). The spatial distribution of vehicles, influenced by various factors, leads to significant workload variations across MEC servers. In this paper, we address task offloading and resource allocation as a joint optimization problem and propose a Load-Balancing Deep Deterministic Policy Gradient (LBDDPG) algorithm to achieve optimal results. The joint optimization problem is modeled as a Markov Decision Process (MDP), enabling the LBDDPG algorithm to systematically address the challenges of workload imbalance and resource inefficiency. The algorithm incorporates a load optimization strategy to balance workload distribution across MEC servers, mitigating disparities caused by uneven vehicle distributions. The reward function is designed to account for both energy consumption and delay, ensuring an optimal trade-off between these critical factors. To enhance training efficiency, a noise-based exploration strategy is employed, preventing ineffective exploration during the early stages. Additionally, constraints such as computational capacity and latency thresholds are embedded to ensure the algorithm’s practical applicability. Experimental results demonstrate that the proposed LBDDPG algorithm achieves faster convergence and superior performance in terms of energy consumption and latency compared to other reinforcement learning algorithms.
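The full paper sits behind the IEEE paywall linked above, so its exact formulas are not reproduced here. A minimal Python sketch of the two mechanisms the abstract names, a reward that trades off delay and energy while penalizing load imbalance across MEC servers, and decaying exploration noise as used in DDPG-family methods, might look as follows. All weights, decay rates, and names (reward, noisy_action, w_balance, sigma0) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def reward(delay, energy, server_loads, w_delay=0.5, w_energy=0.5, w_balance=0.1):
    """Negative weighted cost: lower delay, lower energy, and a more even
    workload across MEC servers all increase the reward.
    Weights are illustrative placeholders, not the paper's values."""
    load_imbalance = float(np.std(server_loads))  # spread of per-server workloads
    return -(w_delay * delay + w_energy * energy + w_balance * load_imbalance)

def noisy_action(actor_action, step, sigma0=0.2, decay=0.999, low=0.0, high=1.0):
    """DDPG-style exploration: add Gaussian noise whose scale decays with the
    training step, then clip back into the feasible action range."""
    sigma = sigma0 * (decay ** step)
    noise = np.random.normal(0.0, sigma, size=np.shape(actor_action))
    return np.clip(actor_action + noise, low, high)

# Toy usage: the action vector could encode offloading ratios or resource shares.
action = noisy_action(np.array([0.7, 0.3]), step=100)
r = reward(delay=0.12, energy=0.85, server_loads=[0.4, 0.9, 0.6])
print(action, r)
```

Clipping the noisy action keeps offloading ratios and resource allocations inside their feasible bounds, which loosely mirrors the capacity and latency constraints the abstract says are embedded in the algorithm.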
Source journal: IEEE Transactions on Consumer Electronics
CiteScore: 7.70 | Self-citation rate: 9.30% | Annual publications: 59 | Average review time: 3.3 months
Journal description: The main focus of the IEEE Transactions on Consumer Electronics is the engineering and research aspects of the theory, design, construction, manufacture, or end use of mass-market electronics, systems, software, and services for consumers.