Tonghuazhai Xu, Nan Wang, Hongtao Lin, Zhaomei Sun
UAV Autonomous Reconnaissance Route Planning Based on Deep Reinforcement Learning
2019 IEEE International Conference on Unmanned Systems (ICUS), 2019-10-01
DOI: https://doi.org/10.1109/ICUS48101.2019.8995935
Citations: 0
Abstract
To improve the autonomous reconnaissance efficiency of an unmanned aerial vehicle (UAV) in an uncertain environment, the situation and observation information acquired by the UAV is stored in a replay buffer. A deep reinforcement learning (DRL) method performs model-free training on the data in the replay buffer to generate the corresponding network model. A reward function designed for UAV regional reconnaissance missions further improves the generalization ability of the model. Simulation results show that the DRL-based UAV autonomous reconnaissance route planning algorithm achieves a high degree of sustainable coverage and that its patrol path is unpredictable.
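
The abstract does not give implementation details, so the following is only a minimal sketch of the two ingredients it names: a replay buffer that stores the UAV's experience, and a reward tied to regional coverage. The transition fields, the `ReplayBuffer` class, and the `coverage_reward` function (which pays more for grid cells that have gone unobserved longer) are all assumptions made for illustration, not the authors' design.

```python
import random
from collections import deque, namedtuple

# Hypothetical transition layout; the paper does not specify its fields.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-capacity store of past transitions, sampled uniformly at random."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest transition once full.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks the temporal
        # correlation between consecutive steps of one flight.
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


def coverage_reward(steps_since_visit, cell):
    """Hypothetical reward: grows with the time a cell has gone unobserved.

    steps_since_visit maps a grid cell to the number of steps since the
    UAV last observed it. This is an assumed form, not the paper's reward.
    """
    return float(steps_since_visit[cell])
```

A training loop would call `push` after every environment step and, once the buffer holds enough transitions, draw a batch with `sample` to update the network. Rewarding stale cells is one plausible way to encourage repeated coverage of a region, but the reward actually used in the paper may differ.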