A Dynamic Resource Optimization Scheme for MEC Task Offloading Based on Policy Gradient

2022 IEEE 6th Information Technology and Mechatronics Engineering Conference (ITOEC). Pub Date: 2022-03-04. DOI: 10.1109/ITOEC53115.2022.9734566

Yiquan Li, Chenxi Yang, Miaoxin Deng, Xue Tang, Wenzao Li

Citations: 1

Abstract

Reinforcement learning (RL) has attracted considerable attention in mobile edge computing (MEC) as an effective tool. In MEC task offloading, the goal is to find a high-quality offloading strategy quickly. The Policy Gradient (PG) algorithm, one of the RL algorithms, is known for its fast convergence; it also does not need to consider state transitions and can be run directly to obtain a result. In a queuing-task scenario with a single terminal and a single edge server, the PG algorithm can quickly obtain a high-quality offloading scheme. Because the Greedy algorithm is also a commonly used decision-making method in MEC task offloading, we use it as the experimental control group and compare it with the exhaustive algorithm. Simulation results indicate that, starting from randomly chosen initial values, the PG algorithm saves more than 50% of the overhead. Although the Greedy algorithm has an advantage when the number of tasks is small, its overhead grows as the number of tasks increases because of its long decision time. The PG algorithm is therefore more effective in our scenario: it obtains a high-quality offloading scheme with a shorter decision time.
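
To make the single-terminal, single-edge-server setting concrete, here is a minimal Python sketch of a policy-gradient offloading search checked against an exhaustive baseline. It is not the authors' implementation: the abstract gives no cost model, so the `overhead` function (sum of task completion times over a local queue and an edge queue), the constants `F_LOCAL`, `F_EDGE`, and `RATE`, the per-task Bernoulli policy, and all hyperparameters are assumptions chosen for illustration. The policy is stateless, which loosely mirrors the abstract's remark that PG need not model state transitions.

```python
import itertools
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
F_LOCAL = 1.0  # terminal CPU speed (cycles per unit time)
F_EDGE = 4.0   # edge-server CPU speed (cycles per unit time)
RATE = 2.0     # uplink rate (data units per unit time)


def overhead(tasks, decisions):
    """Sum of task completion times. decisions[i]: 0 = local, 1 = offload.

    Each task is a (cycles, data) pair. Local tasks queue on the terminal;
    offloaded tasks queue on a single serial uplink-plus-edge pipeline.
    """
    t_local = t_edge = total = 0.0
    for (cycles, data), d in zip(tasks, decisions):
        if d == 0:
            t_local += cycles / F_LOCAL
            total += t_local
        else:
            t_edge += data / RATE + cycles / F_EDGE
            total += t_edge
    return total


def exhaustive(tasks):
    """Brute-force optimum over all 2^n offloading decisions."""
    return min(
        (overhead(tasks, d), list(d))
        for d in itertools.product((0, 1), repeat=len(tasks))
    )


def policy_gradient(tasks, episodes=3000, lr=0.05, seed=0):
    """REINFORCE with one independent Bernoulli logit per task."""
    rng = random.Random(seed)
    theta = [0.0] * len(tasks)  # logit of "offload" for each task
    baseline = None
    best_cost, best_actions = float("inf"), None
    for _ in range(episodes):
        probs = [1.0 / (1.0 + math.exp(-t)) for t in theta]
        actions = [1 if rng.random() < p else 0 for p in probs]
        cost = overhead(tasks, actions)
        if cost < best_cost:
            best_cost, best_actions = cost, actions
        if baseline is None:
            baseline = cost
        advantage = baseline - cost  # reward is -cost; baseline cuts variance
        for i, (a, p) in enumerate(zip(actions, probs)):
            # d/d(theta) of log Bernoulli(a; sigmoid(theta)) is a - p
            theta[i] += lr * advantage * (a - p)
            theta[i] = max(-20.0, min(20.0, theta[i]))  # keep exp() stable
        baseline = 0.9 * baseline + 0.1 * cost
    return best_cost, best_actions


if __name__ == "__main__":
    tasks = [(4.0, 2.0), (2.0, 2.0), (6.0, 1.0), (1.0, 3.0), (3.0, 1.0)]
    print("policy gradient:", policy_gradient(tasks))
    print("exhaustive:     ", exhaustive(tasks))
```

The exhaustive search enumerates 2^n decision vectors, so it is only practical as a reference for small n, while the sampled policy costs one `overhead` evaluation per episode regardless of n. That trade-off is the kind of decision-time difference the abstract describes, though the numbers this toy model produces say nothing about the paper's reported results.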
