A Parallel Multi-Demonstrations Generative Adversarial Imitation Learning Approach on UAV Target Tracking Decision

IF 3 | Zone 4, Computer Science | JCR Q3 ENGINEERING, ELECTRICAL & ELECTRONIC

Chinese Journal of Electronics | Pub Date: 2025-07-01 | DOI: 10.23919/cje.2024.00.082

Haohui Zhang;Bo Li;Jingyi Huang;Chao Song;Pingkuan He;Evgeny Neretin

Chinese Journal of Electronics, vol. 34, no. 4, pp. 1185-1198, 2025. Not open access. Full text: https://ieeexplore.ieee.org/document/11151173/

Citations: 0

Abstract

To address the autonomous decision-making and local-convergence problems that arise when traditional reinforcement learning is applied to unmanned aerial vehicle (UAV) target tracking, this paper proposes a parallel multi-demonstrations generative adversarial imitation reinforcement learning algorithm that controls UAVs and lets them track the target quickly. First, we classify the expert demonstrations by task so that the model can learn as much of the expert experience as possible. Second, we develop a parallel multi-demonstrations training framework based on generative adversarial imitation learning and design strategy update methods for the different types of generators, which preserves the generalization ability of imitation learning while improving training efficiency. Finally, we integrate deep reinforcement learning with imitation learning. In the initial training phase, the focus is on imitation learning, and expert knowledge is periodically transferred to the reinforcement learning experience pool. In the later stages, we increase the proportion of reinforcement learning training and achieve effective UAV target tracking by fine-tuning the weights obtained from reinforcement learning. Experimental results show that, compared with existing reinforcement learning algorithms, our algorithm mitigates issues such as local convergence and completes training in less time while keeping UAV target tracking stable.
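The two-phase schedule described in the abstract (imitation-heavy early training with periodic transfer of expert data into the reinforcement learning experience pool, then a growing share of reinforcement learning) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear decay, the names `imitation_ratio`, `ReplayBuffer`, and `run_schedule`, and all default values are assumptions made for illustration, and the actual GAIL and RL updates are replaced by counters.

```python
import random
from collections import deque


def imitation_ratio(step, total_steps, start=0.9, end=0.1):
    """Share of updates spent on imitation at `step`.

    Assumption: a linear decay from `start` to `end`; the paper only says
    the proportion of reinforcement learning increases in later stages.
    """
    frac = min(max(step / max(total_steps, 1), 0.0), 1.0)
    return start + (end - start) * frac


class ReplayBuffer:
    """Hypothetical bounded experience pool shared by the RL learner."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)

    def add(self, transition):
        self.data.append(transition)

    def __len__(self):
        return len(self.data)


def run_schedule(expert_demos, total_steps=1000, transfer_every=100,
                 batch=4, seed=0):
    """Count how many imitation vs. RL updates the schedule would perform."""
    rng = random.Random(seed)
    buffer = ReplayBuffer(capacity=10_000)
    counts = {"imitation": 0, "rl": 0}
    for step in range(total_steps):
        # Periodically copy a few expert transitions into the RL pool.
        if step % transfer_every == 0:
            k = min(batch, len(expert_demos))
            for transition in rng.sample(expert_demos, k):
                buffer.add(transition)
        # Pick the update type according to the current imitation share.
        if rng.random() < imitation_ratio(step, total_steps):
            counts["imitation"] += 1  # placeholder for a GAIL update
        else:
            counts["rl"] += 1         # placeholder for an RL update
    return counts, len(buffer)
```

Calling `run_schedule(demos)` on a list of expert transitions returns the update counts and the number of expert transitions copied into the pool. A real system would replace the counters with discriminator, generator, and RL policy updates.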




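Source journal

Chinese Journal of Electronics (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 3.70

Self-citation rate: 16.70%

Articles published: 342

Review time: 12.0 months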

About the journal: CJE focuses on the emerging fields of electronics, publishing innovative and transformative research papers. Most of the papers published in CJE are from universities and research institutes, presenting their innovative research results. Both theoretical and practical contributions are encouraged, and original research papers reporting novel solutions to the hot topics in electronics are strongly recommended.