A Parallel Multi-Demonstrations Generative Adversarial Imitation Learning Approach on UAV Target Tracking Decision
Authors: Haohui Zhang; Bo Li; Jingyi Huang; Chao Song; Pingkuan He; Evgeny Neretin
Journal: Chinese Journal of Electronics, vol. 34, no. 4, pp. 1185-1198 (published 2025-07-01)
Journal metrics: Impact Factor 3.0; JCR Q3 (Engineering, Electrical & Electronic)
DOI: 10.23919/cje.2024.00.082
URL: https://ieeexplore.ieee.org/document/11151173/
Traditional reinforcement learning applied to unmanned aerial vehicle (UAV) target tracking struggles with autonomous decision-making and tends to converge to local optima. To address these problems, this paper proposes a parallel multi-demonstrations generative adversarial imitation reinforcement learning algorithm that controls UAVs and allows them to track the target quickly. First, we classify the expert demonstrations by task so that the model learns as much of the expert experience as possible. Second, we develop a parallel multi-demonstrations training framework based on generative adversarial imitation learning and design strategy update methods for different types of generators, which preserves the generalization ability of imitation learning while improving training efficiency. Finally, we integrate deep reinforcement learning with imitation learning. During the initial training phase, we focus on imitation learning while periodically transferring expert knowledge to the reinforcement learning experience pool. In the later stages, we increase the proportion of reinforcement learning training and achieve effective UAV target tracking by fine-tuning the weights obtained from reinforcement learning. Experimental results demonstrate that, compared with existing reinforcement learning algorithms, our algorithm effectively mitigates issues such as local convergence and completes training in less time, ensuring stable target tracking by UAVs.
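The abstract does not specify how the balance between imitation learning and reinforcement learning shifts, or how often expert knowledge is transferred. The following is only a minimal sketch of that idea, not the authors' implementation. It assumes a linear decay of the imitation share, a fixed transfer interval, and toy task labels. All names (`imitation_weight`, `group_by_task`, `seed_replay_buffer`, `run_schedule`) and all numeric values are hypothetical, and the actual network updates are left as placeholders.

```python
import random
from collections import deque


def imitation_weight(step, total_steps, start=0.9, end=0.1):
    """Share of imitation updates at `step` (assumed linear decay, not from the paper)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac


def group_by_task(demos):
    """Group expert transitions by task label, one group per demonstration class."""
    groups = {}
    for task, transition in demos:
        groups.setdefault(task, []).append(transition)
    return groups


def seed_replay_buffer(buffer, expert_groups, per_task, rng):
    """Copy up to `per_task` expert transitions from every task class into the RL buffer."""
    added = 0
    for transitions in expert_groups.values():
        for t in rng.sample(transitions, min(per_task, len(transitions))):
            buffer.append(t)
            added += 1
    return added


def run_schedule(total_steps=1000, transfer_every=100, seed=0):
    """Count how many imitation and RL updates the assumed schedule would trigger."""
    rng = random.Random(seed)
    # Toy expert data: (hypothetical task label, placeholder transition).
    demos = [("approach", ("s", "a", i)) for i in range(20)]
    demos += [("follow", ("s", "a", i)) for i in range(20)]
    groups = group_by_task(demos)
    buffer = deque(maxlen=500)
    counts = {"imitation": 0, "rl": 0}
    for step in range(total_steps):
        if step % transfer_every == 0:
            # Periodic transfer of expert knowledge into the RL experience pool.
            seed_replay_buffer(buffer, groups, per_task=5, rng=rng)
        if rng.random() < imitation_weight(step, total_steps):
            counts["imitation"] += 1  # placeholder for an adversarial imitation update
        else:
            counts["rl"] += 1  # placeholder for an RL update sampled from `buffer`
    return counts, len(buffer)


if __name__ == "__main__":
    print(run_schedule())
```

Under these assumptions, imitation updates dominate early in training and reinforcement learning updates dominate later, while the experience pool receives a small batch from every demonstration class at each transfer.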
About the journal:
CJE focuses on the emerging fields of electronics, publishing innovative and transformative research papers. Most of the papers published in CJE are from universities and research institutes, presenting their innovative research results. Both theoretical and practical contributions are encouraged, and original research papers reporting novel solutions to the hot topics in electronics are strongly recommended.