{"title":"无人机网络空中基站优化部署:一种强化学习方法","authors":"Meng-Chun Hou, Der-Jiunn Deng, Chia‐Ling Wu","doi":"10.1109/GCWkshps45667.2019.9024648","DOIUrl":null,"url":null,"abstract":"The boom of unmanned aerial vehicles (UAVs) is projected to fundamentally shift paradigms of transportations, logistics, agricultures, and public safety as a dominating unmanned application in following decades. To optimally process assigned tasks, each UAV requires prompt and ubiquitous information provisioning regarding the varying operation conditions, which renders exploiting base stations (BSs) of existing wireless infrastructures a tractable solution. To receive services from a BS, a UAV should stay within the coverage area of a BS, which however limits the operation range of a UAV. This obstacle thus drives the deployment of a special sort of UAV, known as an aerial base station (ABS), to relay signals between a BS and a UAV. Based on different flight paths of UAVs, an ABS should autonomously decide its own flight trajectory so as to maximize the number of UAVs which can receive wireless services. However, the inherently non-stationary environment renders the optimum autonomous deployment of an ABS a challenging issue. Inspired by the merit of interacting with the environment, we consequently propose a reinforcement learning scheme to optimize the flight trajectory of an ABS. To eliminate the engineering concern in the conventional Q-learning scheme that most state-action pairs may not be fully visited in the deployment of an ABS, in this paper, a state-amount-reduction (SAR) k-step Q-learning scheme is proposed to avoid the issue in the conventional Q-learning, so as to maximize the number of UAVs receiving services from an ABS. 
Through providing analytical foundations and simulation studies, outstanding performance of the proposed schemes is demonstrated as compared with that of the conventional reinforcement learning based ABS deployment.","PeriodicalId":210825,"journal":{"name":"2019 IEEE Globecom Workshops (GC Wkshps)","volume":"151 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Optimum Aerial Base Station Deployment for UAV Networks: A Reinforcement Learning Approach\",\"authors\":\"Meng-Chun Hou, Der-Jiunn Deng, Chia‐Ling Wu\",\"doi\":\"10.1109/GCWkshps45667.2019.9024648\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The boom of unmanned aerial vehicles (UAVs) is projected to fundamentally shift paradigms of transportations, logistics, agricultures, and public safety as a dominating unmanned application in following decades. To optimally process assigned tasks, each UAV requires prompt and ubiquitous information provisioning regarding the varying operation conditions, which renders exploiting base stations (BSs) of existing wireless infrastructures a tractable solution. To receive services from a BS, a UAV should stay within the coverage area of a BS, which however limits the operation range of a UAV. This obstacle thus drives the deployment of a special sort of UAV, known as an aerial base station (ABS), to relay signals between a BS and a UAV. Based on different flight paths of UAVs, an ABS should autonomously decide its own flight trajectory so as to maximize the number of UAVs which can receive wireless services. However, the inherently non-stationary environment renders the optimum autonomous deployment of an ABS a challenging issue. Inspired by the merit of interacting with the environment, we consequently propose a reinforcement learning scheme to optimize the flight trajectory of an ABS. 
To eliminate the engineering concern in the conventional Q-learning scheme that most state-action pairs may not be fully visited in the deployment of an ABS, in this paper, a state-amount-reduction (SAR) k-step Q-learning scheme is proposed to avoid the issue in the conventional Q-learning, so as to maximize the number of UAVs receiving services from an ABS. Through providing analytical foundations and simulation studies, outstanding performance of the proposed schemes is demonstrated as compared with that of the conventional reinforcement learning based ABS deployment.\",\"PeriodicalId\":210825,\"journal\":{\"name\":\"2019 IEEE Globecom Workshops (GC Wkshps)\",\"volume\":\"151 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE Globecom Workshops (GC Wkshps)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/GCWkshps45667.2019.9024648\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE Globecom Workshops (GC Wkshps)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/GCWkshps45667.2019.9024648","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Optimum Aerial Base Station Deployment for UAV Networks: A Reinforcement Learning Approach
The boom of unmanned aerial vehicles (UAVs) is projected to fundamentally shift the paradigms of transportation, logistics, agriculture, and public safety, making UAVs a dominant unmanned application in the coming decades. To process its assigned tasks optimally, each UAV requires prompt and ubiquitous information about varying operating conditions, which makes exploiting the base stations (BSs) of existing wireless infrastructure a tractable solution. To receive service from a BS, however, a UAV must stay within the BS's coverage area, which limits the UAV's operating range. This obstacle motivates the deployment of a special sort of UAV, known as an aerial base station (ABS), to relay signals between a BS and a UAV. Given the different flight paths of the UAVs, an ABS should autonomously decide its own flight trajectory so as to maximize the number of UAVs that can receive wireless service. The inherently non-stationary environment, however, makes the optimum autonomous deployment of an ABS a challenging issue. Inspired by the merit of learning through interaction with the environment, we therefore propose a reinforcement learning scheme to optimize the flight trajectory of an ABS. To address the engineering concern that, in conventional Q-learning, most state-action pairs may never be fully visited during ABS deployment, this paper proposes a state-amount-reduction (SAR) k-step Q-learning scheme that avoids this issue and maximizes the number of UAVs receiving service from the ABS. Through analytical foundations and simulation studies, the proposed scheme is shown to outperform conventional reinforcement-learning-based ABS deployment.
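The core idea — an agent learning an ABS trajectory that maximizes the number of covered UAVs — can be illustrated with a minimal tabular Q-learning toy. This is not the paper's SAR k-step scheme; it is a hedged sketch of the conventional baseline the paper improves upon, with an assumed 5x5 grid of candidate ABS positions, fixed (hypothetical) UAV locations, and a reward equal to the number of UAVs inside the ABS coverage disc:

```python
import random

# Hypothetical toy parameters; none of these numbers come from the paper.
GRID = 5                      # ABS moves on a GRID x GRID lattice of positions
COVER = 1.5                   # assumed ABS coverage radius (grid units)
UAVS = [(0.5, 0.5), (1.0, 1.5), (3.5, 3.0), (4.0, 3.5)]   # fixed UAV locations
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]      # N/S/E/W/hover

def reward(pos):
    """Number of UAVs inside the ABS coverage disc (the paper's objective)."""
    x, y = pos
    return sum(1 for ux, uy in UAVS
               if (ux - x) ** 2 + (uy - y) ** 2 <= COVER ** 2)

def step(pos, action):
    """Apply a move, clipping the ABS to the grid boundary."""
    x, y = pos
    dx, dy = action
    return (min(GRID - 1, max(0, x + dx)), min(GRID - 1, max(0, y + dy)))

def train(episodes=500, horizon=20, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Standard epsilon-greedy tabular Q-learning over ABS positions."""
    rng = random.Random(seed)
    Q = {}                                     # (position, action-index) -> value
    for _ in range(episodes):
        pos = (rng.randrange(GRID), rng.randrange(GRID))
        for _ in range(horizon):
            if rng.random() < eps:             # explore
                a = rng.randrange(len(ACTIONS))
            else:                              # exploit current estimate
                a = max(range(len(ACTIONS)), key=lambda i: Q.get((pos, i), 0.0))
            nxt = step(pos, ACTIONS[a])
            best_next = max(Q.get((nxt, i), 0.0) for i in range(len(ACTIONS)))
            old = Q.get((pos, a), 0.0)
            Q[(pos, a)] = old + alpha * (reward(nxt) + gamma * best_next - old)
            pos = nxt
    return Q

def greedy_position(Q, start=(2, 2), steps=10):
    """Follow the learned greedy policy and report the final ABS position."""
    pos = start
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: Q.get((pos, i), 0.0))
        pos = step(pos, ACTIONS[a])
    return pos

Q = train()
final = greedy_position(Q)
print("final ABS position:", final, "UAVs served:", reward(final))
```

Note how the state-action table here already has 25 x 5 entries for a tiny 5x5 grid; a realistic 3-D airspace makes exhaustive visitation of all pairs impractical, which is precisely the concern the paper's state-amount-reduction scheme targets.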