Decentralized Task Offloading in Edge Computing: An Offline-to-Online Reinforcement Learning Approach

IF 3.6 | CAS Zone 2 (Computer Science) | JCR Q2 (Computer Science, Hardware & Architecture)
Hongcai Lin;Lei Yang;Hao Guo;Jiannong Cao
{"title":"Decentralized Task Offloading in Edge Computing: An Offline-to-Online Reinforcement Learning Approach","authors":"Hongcai Lin;Lei Yang;Hao Guo;Jiannong Cao","doi":"10.1109/TC.2024.3377912","DOIUrl":null,"url":null,"abstract":"Decentralized task offloading among cooperative edge nodes has been a promising solution to enhance resource utilization and improve users’ Quality of Experience (QoE) in edge computing. However, current decentralized methods, such as heuristics and game theory-based methods, either optimize greedily or depend on rigid assumptions, failing to adapt to the dynamic edge environment. Existing DRL-based approaches train the model in a simulation and then apply it in practical systems. These methods may perform poorly because of the divergence between the practical system and the simulated environment. Other methods that train and deploy the model directly in real-world systems face a cold-start problem, which will reduce the users’ QoE before the model converges. This paper proposes a novel offline-to-online DRL called (O2O-DRL). It uses the heuristic task logs to warm-start the DRL model offline. However, offline and online data have different distributions, so using offline methods for online fine-tuning will ruin the policy learned offline. To avoid this problem, we use on-policy DRL to fine-tune the model and prevent value overestimation. We evaluate O2O-DRL with other approaches in a simulation and a Kubernetes-based testbed. The performance results show that O2O-DRL outperforms other methods and solves the cold-start problem.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"73 6","pages":"1603-1615"},"PeriodicalIF":3.6000,"publicationDate":"2024-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computers","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10473221/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0

Abstract

Decentralized task offloading among cooperative edge nodes is a promising solution for enhancing resource utilization and improving users’ Quality of Experience (QoE) in edge computing. However, current decentralized methods, such as heuristics and game-theory-based methods, either optimize greedily or depend on rigid assumptions, and so fail to adapt to the dynamic edge environment. Existing DRL-based approaches train the model in a simulation and then apply it in practical systems; these methods may perform poorly because of the divergence between the practical system and the simulated environment. Other methods that train and deploy the model directly in real-world systems face a cold-start problem, which reduces users’ QoE before the model converges. This paper proposes a novel offline-to-online DRL approach (O2O-DRL). It uses heuristic task logs to warm-start the DRL model offline. However, offline and online data have different distributions, so using offline methods for online fine-tuning can ruin the policy learned offline. To avoid this problem, we use on-policy DRL to fine-tune the model and prevent value overestimation. We evaluate O2O-DRL against other approaches in a simulation and a Kubernetes-based testbed. The performance results show that O2O-DRL outperforms the other methods and solves the cold-start problem.
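
The abstract describes a two-stage recipe: warm-start a policy offline on heuristic task logs, then fine-tune it online with an on-policy method. The sketch below illustrates that recipe under stated assumptions: a PyTorch policy network, behaviour cloning as the offline warm-start, and a PPO-style clipped objective as the on-policy update. Network sizes, hyperparameters, and function names are illustrative only and are not the authors' implementation.

```python
# Minimal sketch of an offline-to-online warm-start, assuming a discrete
# offloading action space (which edge node receives a task).
import torch
import torch.nn as nn
import torch.nn.functional as F


class PolicyNet(nn.Module):
    """Maps an edge-node state vector to logits over candidate offloading targets."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


def offline_warm_start(policy: PolicyNet,
                       log_states: torch.Tensor,
                       log_actions: torch.Tensor,
                       epochs: int = 20, lr: float = 1e-3) -> None:
    """Stage 1: behaviour-clone the heuristic's offloading decisions from its task logs."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        loss = F.cross_entropy(policy(log_states), log_actions)
        opt.zero_grad()
        loss.backward()
        opt.step()


def online_finetune_step(policy: PolicyNet,
                         opt: torch.optim.Optimizer,
                         states: torch.Tensor,
                         actions: torch.Tensor,
                         advantages: torch.Tensor,
                         old_log_probs: torch.Tensor,
                         clip: float = 0.2) -> float:
    """Stage 2: one on-policy update on freshly collected online transitions.

    Updating only on on-policy data sidesteps the value overestimation that
    off-policy fine-tuning on mismatched offline/online distributions can cause.
    """
    dist = torch.distributions.Categorical(logits=policy(states))
    ratio = torch.exp(dist.log_prob(actions) - old_log_probs)
    loss = -torch.min(ratio * advantages,
                      torch.clamp(ratio, 1.0 - clip, 1.0 + clip) * advantages).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In use, the policy would first be warm-started on the heuristic logs and then deployed, with `online_finetune_step` applied to trajectories it collects itself. How advantages are estimated and how the value function is trained are left out of this sketch.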
Source Journal

IEEE Transactions on Computers (Engineering & Technology: Electrical & Electronic Engineering)
CiteScore: 6.60
Self-citation rate: 5.40%
Articles per year: 199
Review time: 6.0 months
Journal description: The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.