Distributed Reinforcement Learning for NOMA-Enabled Mobile Edge Computing

Zhong Yang, Yuanwei Liu, Yue Chen
{"title":"支持noma的移动边缘计算的分布式强化学习","authors":"Zhong Yang, Yuanwei Liu, Yue Chen","doi":"10.1109/ICCWorkshops49005.2020.9145457","DOIUrl":null,"url":null,"abstract":"A novel non-orthogonal multiple access (NOMA) enabled cache-aided mobile edge computing (MEC) framework is proposed, for minimizing the sum energy consumption. The NOMA strategy enables mobile users to offload computation tasks to the access point (AP) simultaneously, which improves the spectrum efficiency. In this article, the considered resource allocation problem is formulated as a long-term reward maximization problem that involves a joint optimization of task offloading decision, computation resource allocation, and caching decision. To tackle this nontrivial problem, a single-agent Q-learning (SAQ-learning) algorithm is invoked to learn a long-term resource allocation strategy from historical experience. Moreover, a Bayesian learning automata (BLA) based multi-agent Q-learning (MAQ-learning) algorithm is proposed for task offloading decisions. More specifically, a BLA based action select scheme is proposed for the agents in MAQ-learning to select the optimal actions in every state. The proposed BLA based action selection scheme is instantaneously self-correcting, consequently, if the probabilities of two computing models (i.e., local computing and offloading computing) are not equal, the optimal action unveils eventually. Extensive simulations demonstrate that: 1) The proposed cache-aided NOMA MEC framework significantly outperforms the other representative benchmark schemes under various network setups. 2) The effectiveness of the proposed BAL-MAQ-learning algorithm is confirmed from the comparison with the results of conventional reinforcement learning algorithms.","PeriodicalId":254869,"journal":{"name":"2020 IEEE International Conference on Communications Workshops (ICC Workshops)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Distributed Reinforcement Learning for NOMA-Enabled Mobile Edge Computing\",\"authors\":\"Zhong Yang, Yuanwei Liu, Yue Chen\",\"doi\":\"10.1109/ICCWorkshops49005.2020.9145457\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A novel non-orthogonal multiple access (NOMA) enabled cache-aided mobile edge computing (MEC) framework is proposed, for minimizing the sum energy consumption. The NOMA strategy enables mobile users to offload computation tasks to the access point (AP) simultaneously, which improves the spectrum efficiency. In this article, the considered resource allocation problem is formulated as a long-term reward maximization problem that involves a joint optimization of task offloading decision, computation resource allocation, and caching decision. To tackle this nontrivial problem, a single-agent Q-learning (SAQ-learning) algorithm is invoked to learn a long-term resource allocation strategy from historical experience. Moreover, a Bayesian learning automata (BLA) based multi-agent Q-learning (MAQ-learning) algorithm is proposed for task offloading decisions. More specifically, a BLA based action select scheme is proposed for the agents in MAQ-learning to select the optimal actions in every state. 
The proposed BLA based action selection scheme is instantaneously self-correcting, consequently, if the probabilities of two computing models (i.e., local computing and offloading computing) are not equal, the optimal action unveils eventually. Extensive simulations demonstrate that: 1) The proposed cache-aided NOMA MEC framework significantly outperforms the other representative benchmark schemes under various network setups. 2) The effectiveness of the proposed BAL-MAQ-learning algorithm is confirmed from the comparison with the results of conventional reinforcement learning algorithms.\",\"PeriodicalId\":254869,\"journal\":{\"name\":\"2020 IEEE International Conference on Communications Workshops (ICC Workshops)\",\"volume\":\"49 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE International Conference on Communications Workshops (ICC Workshops)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCWorkshops49005.2020.9145457\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE International Conference on Communications Workshops (ICC Workshops)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCWorkshops49005.2020.9145457","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

Abstract

A novel non-orthogonal multiple access (NOMA) enabled, cache-aided mobile edge computing (MEC) framework is proposed for minimizing the sum energy consumption. The NOMA strategy enables mobile users to offload computation tasks to the access point (AP) simultaneously, which improves spectrum efficiency. In this article, the considered resource allocation problem is formulated as a long-term reward maximization problem that involves a joint optimization of the task offloading decision, computation resource allocation, and caching decision. To tackle this nontrivial problem, a single-agent Q-learning (SAQ-learning) algorithm is invoked to learn a long-term resource allocation strategy from historical experience. Moreover, a Bayesian learning automata (BLA) based multi-agent Q-learning (MAQ-learning) algorithm is proposed for task offloading decisions. More specifically, a BLA-based action selection scheme is proposed for the agents in MAQ-learning to select the optimal action in every state. The proposed BLA-based action selection scheme is instantaneously self-correcting; consequently, if the success probabilities of the two computing modes (i.e., local computing and offloading computing) are not equal, the optimal action is eventually revealed. Extensive simulations demonstrate that: 1) the proposed cache-aided NOMA MEC framework significantly outperforms the other representative benchmark schemes under various network setups; and 2) the effectiveness of the proposed BLA-MAQ-learning algorithm is confirmed by comparison with conventional reinforcement learning algorithms.
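To make the self-correcting behavior concrete, the sketch below implements a two-action Bayesian learning automaton of the kind the abstract describes, choosing between local computing and offloading. This is an illustrative reconstruction, not the paper's code: the Beta-posterior sampling rule follows the standard BLA formulation, and the class name, reward probabilities, and toy environment are invented for this example.

```python
import random

# Minimal BLA sketch: one Beta(alpha, beta) posterior per action, where
# action 0 = local computing and action 1 = offloading to the AP.
# The environment's success probabilities below are hypothetical values,
# not taken from the paper.

class BayesianLearningAutomaton:
    def __init__(self, n_actions=2):
        # Beta(1, 1) priors: uninformative starting belief for each action.
        self.alpha = [1] * n_actions
        self.beta = [1] * n_actions

    def select_action(self):
        # Sample an estimated reward probability from each Beta posterior
        # and greedily play the action with the largest sample.
        samples = [random.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, action, success):
        # Binary feedback updates the posterior immediately, which is what
        # makes the scheme instantaneously self-correcting.
        if success:
            self.alpha[action] += 1
        else:
            self.beta[action] += 1

# Toy environment: offloading succeeds more often than local computing,
# so the automaton should converge to action 1.
TRUE_SUCCESS_PROB = [0.4, 0.7]  # hypothetical, for illustration only

bla = BayesianLearningAutomaton()
for _ in range(2000):
    a = bla.select_action()
    bla.update(a, random.random() < TRUE_SUCCESS_PROB[a])

print("posterior means:",
      [round(al / (al + be), 3) for al, be in zip(bla.alpha, bla.beta)])
```

Because each binary outcome sharpens the corresponding Beta posterior, the sampling step concentrates on the better computing mode whenever the two success probabilities differ, mirroring the convergence claim in the abstract.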