Task Offloading and Resource Allocation Based on Reinforcement Learning and Load Balancing in Vehicular Networking

IF 4.3 · CAS Region 2 (Computer Science) · JCR Q1 (Engineering, Electrical & Electronic)
Shujuan Tian, Shuhuan Xiang, Ziqi Zhou, Haipeng Dai, Enze Yu, Qingyong Deng
{"title":"Task Offloading and Resource Allocation Based on Reinforcement Learning and Load Balancing in Vehicular Networking","authors":"Shujuan Tian;Shuhuan Xiang;Ziqi Zhou;Haipeng Dai;Enze Yu;Qingyong Deng","doi":"10.1109/TCE.2025.3542133","DOIUrl":null,"url":null,"abstract":"Due to limited on-board resources and the mobility characteristics of vehicles in a multi-access edge computing (MEC)-based vehicular network, efficient task offloading and resource allocation schemes are essential for achieving low-latency and low-energy consumption applications in the Internet of Vehicles (IoV). The spatial distribution of vehicles, influenced by various factors, leads to significant workload variations across MEC servers. In this paper, we address task offloading and resource allocation as a joint optimization problem and propose a Load-Balancing Deep Deterministic Policy Gradient (LBDDPG) algorithm to achieve optimal results. The joint optimization problem is modeled as a Markov Decision Process (MDP), enabling the LBDDPG algorithm to systematically address the challenges of workload imbalance and resource inefficiency. The algorithm incorporates a load optimization strategy to balance workload distribution across MEC servers, mitigating disparities caused by uneven vehicle distributions. The reward function is designed to account for both energy consumption and delay, ensuring an optimal trade-off between these critical factors. To enhance training efficiency, a noise-based exploration strategy is employed, preventing ineffective exploration during the early stages. Additionally, constraints such as computational capacity and latency thresholds are embedded to ensure the algorithm’s practical applicability. Experimental results demonstrate that the proposed LBDDPG algorithm achieves faster convergence and superior performance in terms of energy consumption and latency compared to other reinforcement learning algorithms.","PeriodicalId":13208,"journal":{"name":"IEEE Transactions on Consumer Electronics","volume":"71 1","pages":"2217-2230"},"PeriodicalIF":4.3000,"publicationDate":"2025-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Consumer Electronics","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10887320/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

Due to limited on-board resources and the mobility characteristics of vehicles in a multi-access edge computing (MEC)-based vehicular network, efficient task offloading and resource allocation schemes are essential for achieving low-latency and low-energy consumption applications in the Internet of Vehicles (IoV). The spatial distribution of vehicles, influenced by various factors, leads to significant workload variations across MEC servers. In this paper, we address task offloading and resource allocation as a joint optimization problem and propose a Load-Balancing Deep Deterministic Policy Gradient (LBDDPG) algorithm to achieve optimal results. The joint optimization problem is modeled as a Markov Decision Process (MDP), enabling the LBDDPG algorithm to systematically address the challenges of workload imbalance and resource inefficiency. The algorithm incorporates a load optimization strategy to balance workload distribution across MEC servers, mitigating disparities caused by uneven vehicle distributions. The reward function is designed to account for both energy consumption and delay, ensuring an optimal trade-off between these critical factors. To enhance training efficiency, a noise-based exploration strategy is employed, preventing ineffective exploration during the early stages. Additionally, constraints such as computational capacity and latency thresholds are embedded to ensure the algorithm’s practical applicability. Experimental results demonstrate that the proposed LBDDPG algorithm achieves faster convergence and superior performance in terms of energy consumption and latency compared to other reinforcement learning algorithms.
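The full paper sits behind the IEEE paywall linked above, so its exact formulas are not reproduced here. A minimal Python sketch of the two mechanisms the abstract names, a reward that trades off delay and energy while penalizing load imbalance across MEC servers, and decaying exploration noise as used in DDPG-family methods, might look as follows. All weights, decay rates, and names (reward, noisy_action, w_balance, sigma0) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def reward(delay, energy, server_loads, w_delay=0.5, w_energy=0.5, w_balance=0.1):
    """Negative weighted cost: lower delay, lower energy, and a more even
    workload across MEC servers all increase the reward.
    Weights are illustrative placeholders, not the paper's values."""
    load_imbalance = float(np.std(server_loads))  # spread of per-server workloads
    return -(w_delay * delay + w_energy * energy + w_balance * load_imbalance)

def noisy_action(actor_action, step, sigma0=0.2, decay=0.999, low=0.0, high=1.0):
    """DDPG-style exploration: add Gaussian noise whose scale decays with the
    training step, then clip back into the feasible action range."""
    sigma = sigma0 * (decay ** step)
    noise = np.random.normal(0.0, sigma, size=np.shape(actor_action))
    return np.clip(actor_action + noise, low, high)

# Toy usage: the action vector could encode offloading ratios or resource shares.
action = noisy_action(np.array([0.7, 0.3]), step=100)
r = reward(delay=0.12, energy=0.85, server_loads=[0.4, 0.9, 0.6])
print(action, r)
```

Clipping the noisy action keeps offloading ratios and resource allocations inside their feasible bounds, which loosely mirrors the capacity and latency constraints the abstract says are embedded in the algorithm.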
Source journal: IEEE Transactions on Consumer Electronics
CiteScore: 7.70 | Self-citation rate: 9.30% | Annual publications: 59 | Average review time: 3.3 months
Journal description: The main focus of the IEEE Transactions on Consumer Electronics is the engineering and research aspects of the theory, design, construction, manufacture, or end use of mass-market electronics, systems, software, and services for consumers.