Authors: Qinghe Liu, Yinghong Tian
DOI: 10.1145/3503047.3503063
Published in: Proceedings of the 3rd International Conference on Advanced Information Science and System
Publication date: 2021-11-26
An Improved GAIL Based on Object Detection, GRU, and Attention
Imitation Learning (IL) learns expert behavior without any reinforcement signal. Thus, it is seen as a potential alternative to Reinforcement Learning (RL) in tasks where reward functions are hard to design. However, most IL-based models do not work well when demonstrations are high-dimensional and the tasks are complex. We set up a realistic UAV racing simulation environment in the AirSim Drone Racing Lab (ADRL) to study these two problems, and we propose a new model that improves on Generative Adversarial Imitation Learning (GAIL). An object detection network trained on the expert dataset allows the model to use high-dimensional visual inputs while alleviating GAIL's data inefficiency. Benefiting from the recurrent structure and the attention mechanism, the model can steer the drone through the gates and complete the race as if it were an expert. Compared to the original GAIL structure, our improved structure showed a 70.6% improvement in average successful crossings over 2000 flight training sessions; the average number of missed crossings decreased by 18.8%, and the average number of collisions decreased by 14.1%.
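The pipeline the abstract describes (detector features for the gates, fused by attention, fed into a recurrent policy) could be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: all dimensions, weight initializations, and the dot-product form of the attention are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, keys):
    # Dot-product attention: weight each detected-gate feature vector
    # by its similarity to the query (here, a projection of the
    # previous recurrent hidden state).
    scores = keys @ query          # (n_gates,)
    weights = softmax(scores)      # attention weights, sum to 1
    return weights @ keys          # (feat_dim,) attended context

def gru_cell(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    # Standard GRU cell (biases omitted for brevity).
    z = 1 / (1 + np.exp(-(Wz @ x + Uz @ h)))   # update gate
    r = 1 / (1 + np.exp(-(Wr @ x + Ur @ h)))   # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde

feat_dim, hid_dim, n_gates = 8, 16, 3          # assumed sizes
gates = rng.standard_normal((n_gates, feat_dim))  # stand-in detector outputs
h = np.zeros(hid_dim)                              # initial hidden state
W = [rng.standard_normal((hid_dim, feat_dim)) * 0.1 for _ in range(3)]
U = [rng.standard_normal((hid_dim, hid_dim)) * 0.1 for _ in range(3)]
Wq = rng.standard_normal((feat_dim, hid_dim)) * 0.1  # hidden -> query

# One control step: attend over gate detections, update the GRU state.
context = attend(Wq @ h, gates)
h = gru_cell(context, h, W[0], U[0], W[1], U[1], W[2], U[2])
```

In a full GAIL setup, the resulting hidden state would feed both the policy head (drone actions) and the discriminator distinguishing expert from policy trajectories; those parts are omitted here.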