A Parallel Multi-Demonstrations Generative Adversarial Imitation Learning Approach on UAV Target Tracking Decision

IF 3 · Zone 4 (Computer Science) · JCR Q3 (Engineering, Electrical & Electronic)
Haohui Zhang;Bo Li;Jingyi Huang;Chao Song;Pingkuan He;Evgeny Neretin
Chinese Journal of Electronics, vol. 34, no. 4, pp. 1185-1198. Published 2025-07-01. Journal article, not open access.
DOI: 10.23919/cje.2024.00.082
Full text: https://ieeexplore.ieee.org/document/11151173/
Citation count: 0

Abstract

To address the autonomous decision-making and local convergence problems that traditional reinforcement learning faces in unmanned aerial vehicle (UAV) target tracking, this paper proposes a parallel multi-demonstrations generative adversarial imitation reinforcement learning algorithm that controls UAVs and allows them to track the target quickly. First, we classify the expert demonstrations by task so that the model can learn as much of the expert experience as possible. Second, we develop a parallel multi-demonstrations training framework based on generative adversarial imitation learning and design strategy update methods for the different types of generators, which preserves the generalization ability of imitation learning while improving training efficiency. Finally, we integrate deep reinforcement learning with imitation learning. During the initial training phase, we focus on imitation learning while periodically transferring expert knowledge to the reinforcement learning experience pool. In the later stages, we increase the proportion of reinforcement learning training and achieve effective UAV target tracking by fine-tuning the weights obtained from reinforcement learning. Experimental results demonstrate that, compared with existing reinforcement learning algorithms, our algorithm effectively mitigates issues such as local convergence and completes training in less time, ensuring stable target tracking by UAVs.
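The staged training described in the abstract (imitation first, expert knowledge periodically copied into the reinforcement learning experience pool, more reinforcement learning later) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the linear decay, the 0.9 and 0.1 endpoints, the transfer interval and size, the task labels, and all function names are hypothetical. The GAIL and reinforcement learning updates are left as counters rather than implemented.

```python
import random
from collections import deque


def imitation_fraction(step, total_steps, start=0.9, end=0.1):
    """Assumed schedule: linearly decay the share of imitation updates."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * progress


def group_by_task(demonstrations):
    """Split (task_label, transition) pairs into one list per task."""
    groups = {}
    for task, transition in demonstrations:
        groups.setdefault(task, []).append(transition)
    return groups


def run_schedule(demonstrations, total_steps=1000, transfer_every=100,
                 transfer_size=4, seed=0):
    """Count how many imitation and RL updates the assumed schedule would run."""
    rng = random.Random(seed)
    expert_by_task = group_by_task(demonstrations)
    replay_buffer = deque(maxlen=10_000)  # stands in for the RL experience pool
    counts = {"imitation": 0, "rl": 0}

    for step in range(total_steps):
        # Periodically copy a few expert transitions per task into the pool.
        if step % transfer_every == 0:
            for transitions in expert_by_task.values():
                k = min(transfer_size, len(transitions))
                replay_buffer.extend(rng.sample(transitions, k))

        # Pick the update type; imitation dominates early, RL dominates late.
        if rng.random() < imitation_fraction(step, total_steps):
            counts["imitation"] += 1  # placeholder for a GAIL update
        else:
            counts["rl"] += 1         # placeholder for an RL update from the pool

    return counts, len(replay_buffer)


if __name__ == "__main__":
    # Hypothetical task labels; the abstract does not name the tasks.
    demos = [("approach", i) for i in range(5)] + [("follow", i) for i in range(5)]
    counts, pool_size = run_schedule(demos)
    print(counts, pool_size)
```

Grouping demonstrations by task before sampling keeps every task represented in each transfer, which is one plausible reading of "classify different expert demonstrations according to different tasks"; the paper's actual mechanism may differ.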
Source journal: Chinese Journal of Electronics (Engineering Technology - Engineering: Electrical & Electronic)
CiteScore: 3.70
Self-citation rate: 16.70%
Articles published: 342
Review time: 12.0 months
About the journal: CJE focuses on the emerging fields of electronics, publishing innovative and transformative research papers. Most of the papers published in CJE are from universities and research institutes, presenting their innovative research results. Both theoretical and practical contributions are encouraged, and original research papers reporting novel solutions to the hot topics in electronics are strongly recommended.