Yibing Li, Sitong Zhang, Fang Ye, T. Jiang, Yingsong Li
Published in: 2020 IEEE USNC-CNC-URSI North American Radio Science Meeting (Joint with AP-S Symposium), 2020-07-05
DOI: https://doi.org/10.23919/USNC/URSI49741.2020.9321625
A UAV Path Planning Method Based on Deep Reinforcement Learning
Path planning for unmanned aerial vehicles (UAVs) is a critical component of rescue operations. Because the task space is continuous and the aircraft is highly dynamic, conventional approaches cannot find the optimal control strategy. This study therefore proposes a UAV path planning method based on deep reinforcement learning (DRL) that enables the UAV to plan its path in a continuous 3D environment. The deep deterministic policy gradient (DDPG) algorithm is employed so that the UAV can make decisions autonomously. To avoid obstacles, the concepts of a connected area and a threat function are introduced and used in reward shaping. Finally, an environment with static obstacles is built, and the agent is trained using the proposed method. The experiments show that the proposed algorithm adapts to a range of scenarios.
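The abstract does not define the threat function or the connected area, so the following is only a minimal sketch of how a threat term might enter a shaped reward for obstacle avoidance. Everything in it is an assumption made for illustration and not the authors' formulation: the spherical obstacle model, the linear decay in `threat`, the names `SphereObstacle` and `shaped_reward`, and all weights and radii. The connected-area concept is omitted because the abstract gives no detail about it.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class SphereObstacle:
    """Hypothetical static obstacle modelled as a sphere."""
    center: Point
    radius: float


def threat(position: Point, obstacle: SphereObstacle, influence: float = 5.0) -> float:
    """Assumed threat value in [0, 1].

    Returns 1 on or inside the obstacle and decays linearly to 0 once the
    UAV is `influence` units away from the obstacle surface.
    """
    gap = math.dist(position, obstacle.center) - obstacle.radius
    if gap <= 0.0:
        return 1.0
    return max(0.0, 1.0 - gap / influence)


def shaped_reward(
    prev_pos: Point,
    pos: Point,
    goal: Point,
    obstacles: Sequence[SphereObstacle],
    threat_weight: float = 2.0,
    goal_radius: float = 1.0,
    goal_bonus: float = 10.0,
) -> float:
    """Progress toward the goal, minus a threat penalty, plus a goal bonus."""
    # Positive when the step reduced the distance to the goal.
    progress = math.dist(prev_pos, goal) - math.dist(pos, goal)
    # Penalise only the most threatening obstacle at the new position.
    penalty = threat_weight * max((threat(pos, o) for o in obstacles), default=0.0)
    # One-off bonus when the UAV is within goal_radius of the goal.
    bonus = goal_bonus if math.dist(pos, goal) <= goal_radius else 0.0
    return progress - penalty + bonus


if __name__ == "__main__":
    obstacle = SphereObstacle(center=(5.0, 0.0, 0.0), radius=1.0)
    reward = shaped_reward(
        prev_pos=(0.0, 0.0, 0.0),
        pos=(1.0, 0.0, 0.0),
        goal=(10.0, 0.0, 0.0),
        obstacles=[obstacle],
    )
    print(f"shaped reward for one step: {reward:.3f}")
```

In a DDPG training loop, a function of this kind would supply the scalar reward stored with each transition in the replay buffer. Taking the maximum threat rather than the sum is one design choice among several; summing would penalise cluttered regions more heavily.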