Path Design for NOMA-Enhanced Robots: A Machine Learning Approach with Radio Map

Ruikang Zhong, Xiao Liu, Yuanwei Liu, Di Zhang, Yue Chen
Published in: 2021 IEEE International Conference on Communications Workshops (ICC Workshops)
Publication date: 2021-06-01
DOI: 10.1109/ICCWorkshops50388.2021.9473594 (https://doi.org/10.1109/ICCWorkshops50388.2021.9473594)
Citations: 0

Abstract

A communication-enabled service framework for indoor intelligent robots (IRs) is proposed, in which the non-orthogonal multiple access (NOMA) technique is adopted to enhance the data rate and user fairness. Building on the proposed communication model, the motions of the IRs and the downlink power allocation policy are jointly optimized to maximize the mission efficiency and communication reliability of the IRs. To find the optimal path for the IRs from their initial points to their mission destinations, a novel reinforcement learning approach, the deep transfer deterministic policy gradient (DT-DPG) algorithm, is proposed. To save training time and hardware costs, the radio map is investigated and provided to the agent as a virtual training environment. Simulation results demonstrate that 1) adopting the NOMA technique effectively improves the communication reliability of the IRs; 2) the radio map is qualified to serve as a virtual training environment, and its statistical channel state information improves training efficiency by about 30%; and 3) the proposed algorithm outperforms the deep deterministic policy gradient (DDPG) algorithm in terms of optimization performance, training time, and ability to avoid local optima.
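To make the role of downlink power allocation concrete, the following is a minimal sketch of the textbook two-user downlink NOMA rate calculation with successive interference cancellation (SIC). It is not taken from the paper: the abstract does not specify the number of users, the decoding order, or the channel model, so the function name `noma_downlink_rates`, its parameters, and the two-user setting are all illustrative assumptions.

```python
import math


def noma_downlink_rates(p_total, alpha_weak, gain_strong, gain_weak, noise_power):
    """Achievable rates (bit/s/Hz) for a hypothetical two-user downlink NOMA pair.

    Assumptions (not from the paper): exactly two users, channel power gains
    known at the transmitter, perfect SIC at the strong user.
    alpha_weak is the fraction of p_total allocated to the weak user.
    """
    if not 0.5 < alpha_weak < 1.0:
        raise ValueError("the weak user should receive more than half of the power")
    if gain_strong < gain_weak:
        raise ValueError("gain_strong must be at least gain_weak")

    p_weak = alpha_weak * p_total
    p_strong = (1.0 - alpha_weak) * p_total

    # The weak user decodes its own signal directly and treats the
    # strong user's signal as interference.
    sinr_weak = p_weak * gain_weak / (p_strong * gain_weak + noise_power)

    # The strong user first decodes and subtracts the weak user's signal
    # (SIC), then decodes its own signal free of intra-pair interference.
    snr_strong = p_strong * gain_strong / noise_power

    return math.log2(1.0 + snr_strong), math.log2(1.0 + sinr_weak)


# Example: 80% of the power goes to the weak user.
rate_strong, rate_weak = noma_downlink_rates(
    p_total=1.0, alpha_weak=0.8, gain_strong=10.0, gain_weak=1.0, noise_power=1.0
)
print(f"strong user: {rate_strong:.3f} bit/s/Hz, weak user: {rate_weak:.3f} bit/s/Hz")
```

Raising `alpha_weak` trades rate from the strong user to the weak user, which is the kind of fairness lever a power allocation policy can tune alongside the robots' positions.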