{"title":"3D Trajectory Design of UAV Based on Deep Reinforcement Learning in Time-varying Scenes","authors":"Qingya Li, Li Guo, Chao Dong, Xidong Mu","doi":"10.1145/3507971.3507982","DOIUrl":null,"url":null,"abstract":"A joint framework is proposed for the 3D trajectory design of an unmanned aerial vehicle (UAV) as an flying base station under the time-varying scenarios of users’ mobility and communication request probability changes. The problem of 3D trajectory design is formulated for maximizing the throughput during a UAV’s flying period while satisfying the rate requirement of all ground users (GUEs). Specifically, we consider that GUEs change their positions and communication request probabilities at each time slot; the UAV needs to predict these changes so that it can design its 3D trajectory in advance to achieve the optimization target. In an effort to solve this pertinent problem, an echo state network (ESN) based prediction algorithm is first proposed for predicting the positions and communication request probabilities of GUEs. Based on these predictions, a Deep Reinforcement Learning (DRL) method is then invoked for finding the optimal deployment locations of UAV in each time slots. The proposed method 1) uses ESN based predictions to represent a part of DRL agent’s state; 2) designs the action and reward for DRL agent to learn the environment and its dynamics; 3) makes optimal strategy under the guidance of a double deep Q network (DDQN). The simulation results show that the UAV can dynamically adjust its trajectory to adapt to time-varying scenarios through our proposed algorithm and throughput gains of about 10.68% are achieved.","PeriodicalId":439757,"journal":{"name":"Proceedings of the 7th International Conference on Communication and Information Processing","volume":"158 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 7th International Conference on Communication and Information Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3507971.3507982","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
A joint framework is proposed for the 3D trajectory design of an unmanned aerial vehicle (UAV) as an flying base station under the time-varying scenarios of users’ mobility and communication request probability changes. The problem of 3D trajectory design is formulated for maximizing the throughput during a UAV’s flying period while satisfying the rate requirement of all ground users (GUEs). Specifically, we consider that GUEs change their positions and communication request probabilities at each time slot; the UAV needs to predict these changes so that it can design its 3D trajectory in advance to achieve the optimization target. In an effort to solve this pertinent problem, an echo state network (ESN) based prediction algorithm is first proposed for predicting the positions and communication request probabilities of GUEs. Based on these predictions, a Deep Reinforcement Learning (DRL) method is then invoked for finding the optimal deployment locations of UAV in each time slots. The proposed method 1) uses ESN based predictions to represent a part of DRL agent’s state; 2) designs the action and reward for DRL agent to learn the environment and its dynamics; 3) makes optimal strategy under the guidance of a double deep Q network (DDQN). The simulation results show that the UAV can dynamically adjust its trajectory to adapt to time-varying scenarios through our proposed algorithm and throughput gains of about 10.68% are achieved.