Dynamic Beam Hopping for DVB-S2X Satellite: A Multi-Objective Deep Reinforcement Learning Approach

Yuchen Zhang, Xin Hu, Rong Chen, Zhili Zhang, Liquan Wang, Weidong Wang
Published in: 2019 IEEE International Conferences on Ubiquitous Computing & Communications (IUCC) and Data Science and Computational Intelligence (DSCI) and Smart Computing, Networking and Services (SmartCNS)
Publication date: 2019-10-01
DOI: https://doi.org/10.1109/IUCC/DSCI/SmartCNS.2019.00056
Citations: 8

Abstract

Dynamic Beam Hopping (DBH) is a key technology for flexibly adapting to different service configurations in the multi-beam satellite communications market. The conventional beam hopping method ignores the intrinsic correlation between decisions and therefore only obtains the optimal solution for the current time step, whereas deep reinforcement learning (DRL) is a typical approach to sequential decision problems. To address the DBH problem in a Differentiated Services (DIFFSERV) scenario, this paper designs a multi-objective deep reinforcement learning (MO-DRL) algorithm. In addition, as the required number of beams grows, the complexity of system implementation increases significantly. This paper therefore proposes a time division multi-action selection method (TD-MASM) to solve the curse-of-dimensionality problem. Under realistic conditions, the low-complexity MO-DRL algorithm can ensure fairness across cells, improve throughput to about 5540 Mbps, and reduce delay to about 0.367 ms. The simulation results show that when a genetic algorithm (GA) is used to achieve similar results, its complexity is about 110 times that of the MO-DRL algorithm.
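The abstract does not describe how TD-MASM works internally, so the following is only a rough illustration of the curse of dimensionality it refers to, not the authors' method. It assumes, hypothetically, that a beam hopping decision means choosing which `num_beams` of `num_cells` cells to illuminate in a time slot, and compares the size of that joint action space with the output size needed if one cell were selected per sub-step instead. The function names and the example cell and beam counts are invented for illustration.

```python
from math import comb


def joint_action_count(num_cells: int, num_beams: int) -> int:
    """Number of distinct ways to choose which cells are illuminated
    in one time slot when all beams are decided at once."""
    return comb(num_cells, num_beams)


def sequential_output_count(num_cells: int) -> int:
    """Output size needed if a single cell is chosen per sub-step
    (assumed decomposition, not taken from the paper)."""
    return num_cells


if __name__ == "__main__":
    # Hypothetical system sizes, chosen only to show the growth rate.
    for num_cells, num_beams in [(16, 4), (37, 10), (64, 16)]:
        joint = joint_action_count(num_cells, num_beams)
        per_step = sequential_output_count(num_cells)
        print(
            f"{num_cells} cells, {num_beams} beams: "
            f"{joint} joint actions vs "
            f"{num_beams} sub-steps x {per_step} outputs"
        )
```

Under these assumptions the joint action space grows combinatorially with the number of beams, while a per-sub-step selection keeps the output size linear in the number of cells, which is the kind of reduction a time-division selection scheme would aim for.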