Deep Reinforcement Learning Method for Task Offloading in Mobile Edge Computing Networks Based on Parallel Exploration with Asynchronous Training

Junyan Chen, Lei Jin, Rui Yao, Hongmei Zhang
{"title":"基于异步训练并行探索的移动边缘计算网络任务卸载深度强化学习方法","authors":"Junyan Chen, Lei Jin, Rui Yao, Hongmei Zhang","doi":"10.1007/s11036-024-02397-7","DOIUrl":null,"url":null,"abstract":"<p>In mobile edge computing (MEC), randomly offloading tasks to edge servers (ES) can cause wireless devices (WD) to compete for limited bandwidth resources, leading to overall performance degradation. Reinforcement learning can provide suitable strategies for task offloading and resource allocation through exploration and trial-and-error, helping to avoid blind offloading. However, traditional reinforcement learning algorithms suffer from slow convergence and a tendency to get stuck in suboptimal local minima, significantly impacting the energy consumption and data timeliness of edge computing task unloading. To address these issues, we propose Parallel Exploration with Asynchronous Training-based Deep Reinforcement Learning (PEATDRL) algorithm for MEC network offloading decisions. Its objective is to maximize system performance while limiting energy consumption in an MEC environment characterized by time-varying wireless channels and random user task arrivals. Firstly, our model employs two independent DNNs for parallel exploration, each generating different offloading strategies. This parallel exploration enhances environmental adaptability, avoids the limitations of a single DNN, and addresses the issue of agents getting stuck in suboptimal local minima due to the explosion of decision combinations, thereby improving decision performance. Secondly, we set different learning rates for the two DNNs during the training phase and trained them at various intervals. This asynchronous training strategy increases the randomness of decision exploration, prevents the two DNNs from converging to the same suboptimal local solution, and improves convergence efficiency by enhancing sample utilization. Finally, we examine the impact of different parallel levels and training step differences on system performance metrics and explain the parameter choices. Experimental results show that the proposed method provides a viable solution to the performance issues caused by slow convergence and local minima, with PEATDRL improving task queue convergence speed by more than 20% compared to baseline algorithms.</p>","PeriodicalId":501103,"journal":{"name":"Mobile Networks and Applications","volume":"1837 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Deep Reinforcement Learning Method for Task Offloading in Mobile Edge Computing Networks Based on Parallel Exploration with Asynchronous Training\",\"authors\":\"Junyan Chen, Lei Jin, Rui Yao, Hongmei Zhang\",\"doi\":\"10.1007/s11036-024-02397-7\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>In mobile edge computing (MEC), randomly offloading tasks to edge servers (ES) can cause wireless devices (WD) to compete for limited bandwidth resources, leading to overall performance degradation. Reinforcement learning can provide suitable strategies for task offloading and resource allocation through exploration and trial-and-error, helping to avoid blind offloading. However, traditional reinforcement learning algorithms suffer from slow convergence and a tendency to get stuck in suboptimal local minima, significantly impacting the energy consumption and data timeliness of edge computing task unloading. 
To address these issues, we propose Parallel Exploration with Asynchronous Training-based Deep Reinforcement Learning (PEATDRL) algorithm for MEC network offloading decisions. Its objective is to maximize system performance while limiting energy consumption in an MEC environment characterized by time-varying wireless channels and random user task arrivals. Firstly, our model employs two independent DNNs for parallel exploration, each generating different offloading strategies. This parallel exploration enhances environmental adaptability, avoids the limitations of a single DNN, and addresses the issue of agents getting stuck in suboptimal local minima due to the explosion of decision combinations, thereby improving decision performance. Secondly, we set different learning rates for the two DNNs during the training phase and trained them at various intervals. This asynchronous training strategy increases the randomness of decision exploration, prevents the two DNNs from converging to the same suboptimal local solution, and improves convergence efficiency by enhancing sample utilization. Finally, we examine the impact of different parallel levels and training step differences on system performance metrics and explain the parameter choices. Experimental results show that the proposed method provides a viable solution to the performance issues caused by slow convergence and local minima, with PEATDRL improving task queue convergence speed by more than 20% compared to baseline algorithms.</p>\",\"PeriodicalId\":501103,\"journal\":{\"name\":\"Mobile Networks and Applications\",\"volume\":\"1837 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Mobile Networks and Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1007/s11036-024-02397-7\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Mobile Networks and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s11036-024-02397-7","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract


In mobile edge computing (MEC), randomly offloading tasks to edge servers (ES) can cause wireless devices (WD) to compete for limited bandwidth resources, leading to overall performance degradation. Reinforcement learning can provide suitable strategies for task offloading and resource allocation through exploration and trial-and-error, helping to avoid blind offloading. However, traditional reinforcement learning algorithms suffer from slow convergence and a tendency to get stuck in suboptimal local minima, significantly impacting the energy consumption and data timeliness of edge computing task offloading. To address these issues, we propose the Parallel Exploration with Asynchronous Training-based Deep Reinforcement Learning (PEATDRL) algorithm for MEC network offloading decisions. Its objective is to maximize system performance while limiting energy consumption in an MEC environment characterized by time-varying wireless channels and random user task arrivals. First, our model employs two independent DNNs for parallel exploration, each generating different offloading strategies. This parallel exploration enhances environmental adaptability, avoids the limitations of a single DNN, and addresses the issue of agents getting stuck in suboptimal local minima due to the explosion of decision combinations, thereby improving decision performance. Second, we set different learning rates for the two DNNs during the training phase and train them at different intervals. This asynchronous training strategy increases the randomness of decision exploration, prevents the two DNNs from converging to the same suboptimal local solution, and improves convergence efficiency by enhancing sample utilization. Finally, we examine the impact of different parallelism levels and training-interval differences on system performance metrics and explain the parameter choices. Experimental results show that the proposed method provides a viable solution to the performance issues caused by slow convergence and local minima, with PEATDRL improving task queue convergence speed by more than 20% compared to baseline algorithms.
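The abstract does not include an implementation, but the core idea of two DNNs exploring offloading decisions in parallel while being trained asynchronously can be illustrated with a short sketch. The code below assumes a DROO-style workflow (a policy DNN maps channel/queue state to relaxed offloading decisions, candidates are quantized and scored, and the best candidate is stored in a shared replay memory); the network sizes, learning rates, training intervals, and the `evaluate_utility` placeholder are illustrative assumptions, not the authors' actual design.

```python
# Minimal sketch of parallel exploration with asynchronous training (PyTorch).
# All hyperparameters and helper functions are hypothetical placeholders.
import random
import torch
import torch.nn as nn

N_WDS = 10            # number of wireless devices (assumed)
MEMORY_SIZE = 1024    # shared replay memory capacity (assumed)
BATCH_SIZE = 128

def make_policy_net():
    # Maps the channel gains of N_WDS devices to per-device offloading probabilities.
    return nn.Sequential(
        nn.Linear(N_WDS, 120), nn.ReLU(),
        nn.Linear(120, 80), nn.ReLU(),
        nn.Linear(80, N_WDS), nn.Sigmoid(),
    )

# Two independent DNNs with *different* learning rates.
nets = [make_policy_net(), make_policy_net()]
optims = [torch.optim.Adam(nets[0].parameters(), lr=1e-2),
          torch.optim.Adam(nets[1].parameters(), lr=1e-3)]
train_intervals = [10, 16]   # each DNN updates on its own schedule (assumed values)
memory = []                  # shared replay memory of (state, best_action) pairs
loss_fn = nn.BCELoss()

def quantize(probs, k=4):
    """Turn relaxed outputs into k candidate binary offloading decisions (assumed scheme)."""
    base = (probs > 0.5).float()
    candidates = [base]
    order = torch.argsort(torch.abs(probs - 0.5))  # flip the most uncertain bits first
    for idx in order[: k - 1]:
        alt = base.clone()
        alt[idx] = 1.0 - alt[idx]
        candidates.append(alt)
    return candidates

def evaluate_utility(state, action):
    # Placeholder for the system-performance objective (e.g. computation rate under
    # an energy constraint); replaced here by a random score for illustration.
    return random.random()

for t in range(1, 3001):                      # time frames
    state = torch.rand(N_WDS)                 # stand-in for channel gains / queue state
    # Parallel exploration: both DNNs propose candidate offloading decisions.
    candidates = []
    with torch.no_grad():
        for net in nets:
            candidates += quantize(net(state))
    best = max(candidates, key=lambda a: evaluate_utility(state, a))
    memory.append((state, best))
    memory = memory[-MEMORY_SIZE:]

    # Asynchronous training: each DNN trains at its own interval and learning rate,
    # sampling from the shared memory, so the two policies do not collapse together.
    for net, opt, interval in zip(nets, optims, train_intervals):
        if t % interval == 0 and len(memory) >= BATCH_SIZE:
            batch = random.sample(memory, BATCH_SIZE)
            states = torch.stack([s for s, _ in batch])
            actions = torch.stack([a for _, a in batch])
            opt.zero_grad()
            loss = loss_fn(net(states), actions)
            loss.backward()
            opt.step()
```

Because the two networks see the same replay memory but update on different schedules and with different step sizes, their decision boundaries drift apart, which is the mechanism the abstract credits for avoiding a shared suboptimal local solution.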
