Optimum Aerial Base Station Deployment for UAV Networks: A Reinforcement Learning Approach

Meng-Chun Hou, Der-Jiunn Deng, Chia-Ling Wu
2019 IEEE Globecom Workshops (GC Wkshps), December 2019.
DOI: 10.1109/GCWkshps45667.2019.9024648
Citations: 4

Abstract

The boom of unmanned aerial vehicles (UAVs) is projected to fundamentally shift the paradigms of transportation, logistics, agriculture, and public safety, as a dominating unmanned application in the coming decades. To optimally process assigned tasks, each UAV requires prompt and ubiquitous information provisioning regarding varying operating conditions, which makes exploiting the base stations (BSs) of existing wireless infrastructure a tractable solution. To receive service from a BS, however, a UAV must stay within the BS's coverage area, which limits the UAV's operating range. This obstacle drives the deployment of a special sort of UAV, known as an aerial base station (ABS), to relay signals between a BS and a UAV. Based on the different flight paths of the UAVs, an ABS should autonomously decide its own flight trajectory so as to maximize the number of UAVs that can receive wireless service. However, the inherently non-stationary environment renders the optimum autonomous deployment of an ABS a challenging issue. Inspired by the merit of interacting with the environment, we therefore propose a reinforcement learning scheme to optimize the flight trajectory of an ABS. To eliminate the engineering concern in conventional Q-learning that most state-action pairs may not be fully visited during the deployment of an ABS, this paper proposes a state-amount-reduction (SAR) k-step Q-learning scheme that avoids this issue, so as to maximize the number of UAVs receiving service from an ABS. Through analytical foundations and simulation studies, the outstanding performance of the proposed schemes is demonstrated in comparison with that of conventional reinforcement-learning-based ABS deployment.
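The details of the paper's SAR k-step Q-learning scheme are not reproduced on this page, but the general idea it builds on can be illustrated. The sketch below is a generic k-step (multi-step) Q-learning loop on a toy grid, where the agent is an ABS, the reward at each position is the number of UAVs inside an assumed coverage radius, and the k-step return bootstraps from the greedy Q-value at the end of each short trajectory. The grid size, UAV positions, coverage radius, and all hyperparameters are hypothetical; this is not the authors' algorithm, only a minimal illustration of the multi-step Q-learning idea it extends.

```python
import random

GRID = 5
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]  # up/down/left/right/hover
UAVS = [(0, 4), (2, 2), (4, 1)]  # assumed (static) UAV positions
RADIUS = 1                       # assumed coverage radius (Chebyshev distance)

def reward(state):
    """Number of UAVs the ABS covers from this cell (the quantity to maximize)."""
    x, y = state
    return sum(1 for (ux, uy) in UAVS
               if max(abs(ux - x), abs(uy - y)) <= RADIUS)

def step(state, a):
    """Move the ABS one cell, clamped to the grid boundary."""
    dx, dy = ACTIONS[a]
    return (min(max(state[0] + dx, 0), GRID - 1),
            min(max(state[1] + dy, 0), GRID - 1))

def k_step_q_learning(episodes=400, k=3, alpha=0.2, gamma=0.9, eps=0.1):
    """Generic k-step Q-learning sketch (hypothetical parameters throughout)."""
    q = {((x, y), a): 0.0 for x in range(GRID) for y in range(GRID)
         for a in range(len(ACTIONS))}
    for _ in range(episodes):
        s = (random.randrange(GRID), random.randrange(GRID))
        for _ in range(30):
            # Roll out a k-step trajectory under an eps-greedy policy.
            traj, si = [], s
            for _ in range(k):
                if random.random() < eps:
                    a = random.randrange(len(ACTIONS))
                else:
                    a = max(range(len(ACTIONS)), key=lambda b: q[(si, b)])
                sn = step(si, a)
                traj.append((si, a, reward(sn)))
                si = sn
            # k-step return, bootstrapped from the greedy value at the tail.
            g = max(q[(si, b)] for b in range(len(ACTIONS)))
            for (_, _, r) in reversed(traj):
                g = r + gamma * g
            # Update only the first state-action pair with the k-step target.
            s0, a0, _ = traj[0]
            q[(s0, a0)] += alpha * (g - q[(s0, a0)])
            s = step(s0, a0)  # advance the ABS one step and continue
    return q
```

Looking ahead k steps before updating lets reward information from a covered cell propagate back to distant states in far fewer visits than one-step Q-learning, which is the practical motivation the abstract cites for reducing how thoroughly the state-action space must be explored.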