Research on Autonomous Decision-Making of UCAV Based on Deep Reinforcement Learning
L. Wang, Hongtao Wei
2022 3rd Information Communication Technologies Conference (ICTC), 2022-05-06
DOI: 10.1109/ictc55111.2022.9778652
Citations: 2
Abstract
To make training opponents in UCAV air combat simulation more intelligent, and to make air combat simulation in 3D space more realistic and immersive, this paper proposes a deep reinforcement learning algorithm for UCAV autonomous control based on virtual reality technology. Reinforcement learning is combined with Unity3D to train UCAV agents to carry out air combat tasks in a 3D virtual reality space, and imitation learning is added to improve the efficiency of policy generation. Multiple perceptrons are used to simplify the agent’s acquisition of environmental state data, and reward functions are designed by integrating UCAV angle, speed, and altitude considerations. The entire process of the UCAV agents interacting with the environment during reinforcement learning training is visualized in 3D.
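The abstract does not state the reward formula, so the following is only a minimal sketch of how angle, speed, and altitude terms might be combined into one scalar reward. The weighted-sum form, the weights, the speed and altitude bands, and every name here (`RewardWeights`, `angle_reward`, `band_reward`, `shaped_reward`) are assumptions made for illustration, not the authors' design.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class RewardWeights:
    # Hypothetical weights; the abstract gives no values.
    angle: float = 0.5
    speed: float = 0.3
    altitude: float = 0.2


def angle_reward(aspect_error_rad: float) -> float:
    """1.0 when the nose points at the target, 0.0 when it points directly away."""
    return 1.0 - min(abs(aspect_error_rad), math.pi) / math.pi


def band_reward(value: float, low: float, high: float) -> float:
    """1.0 inside [low, high], falling linearly to 0.0 one band-width outside."""
    width = high - low
    if width <= 0:
        raise ValueError("high must be greater than low")
    if low <= value <= high:
        return 1.0
    distance = low - value if value < low else value - high
    return max(0.0, 1.0 - distance / width)


def shaped_reward(
    aspect_error_rad: float,
    speed_mps: float,
    altitude_m: float,
    weights: Optional[RewardWeights] = None,
) -> float:
    """Weighted sum of the three terms; the bands below are assumed values."""
    w = weights or RewardWeights()
    r_angle = angle_reward(aspect_error_rad)
    r_speed = band_reward(speed_mps, 200.0, 300.0)      # assumed speed band (m/s)
    r_alt = band_reward(altitude_m, 3000.0, 8000.0)     # assumed altitude band (m)
    return w.angle * r_angle + w.speed * r_speed + w.altitude * r_alt


if __name__ == "__main__":
    # Nose on target, speed and altitude inside their bands.
    print(shaped_reward(0.0, 250.0, 5000.0))
    # Pointing away, too slow, too low.
    print(shaped_reward(math.pi, 150.0, 1000.0))
```

Keeping each term in the range 0 to 1 before weighting makes the weights directly comparable, which is one common way to keep a multi-term shaped reward easy to tune; whether the paper normalizes its terms this way is not stated.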