基于深度强化学习和变压器网络的空战智能机动决策方法。

IF 2.1 3区 物理与天体物理 Q2 PHYSICS, MULTIDISCIPLINARY
Entropy Pub Date : 2024-11-29 DOI:10.3390/e26121036
Wentao Li, Feng Fang, Dongliang Peng, Shuning Han
{"title":"基于深度强化学习和变压器网络的空战智能机动决策方法。","authors":"Wentao Li, Feng Fang, Dongliang Peng, Shuning Han","doi":"10.3390/e26121036","DOIUrl":null,"url":null,"abstract":"<p><p>The traditional maneuver decision-making approaches are highly dependent on accurate and complete situation information, and their decision-making quality becomes poor when opponent information is occasionally missing in complex electromagnetic environments. In order to solve this problem, an autonomous maneuver decision-making approach is developed based on deep reinforcement learning (DRL) architecture. Meanwhile, a Transformer network is integrated into the actor and critic networks, which can find the potential dependency relationships among the time series trajectory data. By using these relationships, the information loss is partially compensated, which leads to maneuvering decisions being more accurate. The issues of limited experience samples, low sampling efficiency, and poor stability in the agent training state appear when the Transformer network is introduced into DRL. To address these issues, the measures of designing an effective decision-making reward, a prioritized sampling method, and a dynamic learning rate adjustment mechanism are proposed. Numerous simulation results show that the proposed approach outperforms the traditional DRL algorithms, with a higher win rate in the case of opponent information loss.</p>","PeriodicalId":11694,"journal":{"name":"Entropy","volume":"26 12","pages":""},"PeriodicalIF":2.1000,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11727636/pdf/","citationCount":"0","resultStr":"{\"title\":\"An Intelligent Maneuver Decision-Making Approach for Air Combat Based on Deep Reinforcement Learning and Transformer Networks.\",\"authors\":\"Wentao Li, Feng Fang, Dongliang Peng, Shuning Han\",\"doi\":\"10.3390/e26121036\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>The traditional maneuver decision-making approaches are highly dependent on accurate and complete situation information, and their decision-making quality becomes poor when opponent information is occasionally missing in complex electromagnetic environments. In order to solve this problem, an autonomous maneuver decision-making approach is developed based on deep reinforcement learning (DRL) architecture. Meanwhile, a Transformer network is integrated into the actor and critic networks, which can find the potential dependency relationships among the time series trajectory data. By using these relationships, the information loss is partially compensated, which leads to maneuvering decisions being more accurate. The issues of limited experience samples, low sampling efficiency, and poor stability in the agent training state appear when the Transformer network is introduced into DRL. To address these issues, the measures of designing an effective decision-making reward, a prioritized sampling method, and a dynamic learning rate adjustment mechanism are proposed. Numerous simulation results show that the proposed approach outperforms the traditional DRL algorithms, with a higher win rate in the case of opponent information loss.</p>\",\"PeriodicalId\":11694,\"journal\":{\"name\":\"Entropy\",\"volume\":\"26 12\",\"pages\":\"\"},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2024-11-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11727636/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Entropy\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://doi.org/10.3390/e26121036\",\"RegionNum\":3,\"RegionCategory\":\"物理与天体物理\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"PHYSICS, MULTIDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Entropy","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.3390/e26121036","RegionNum":3,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PHYSICS, MULTIDISCIPLINARY","Score":null,"Total":0}
引用次数: 0

摘要

传统的机动决策方法高度依赖于准确完整的态势信息,在复杂的电磁环境中,当对手信息偶尔缺失时,其决策质量就会下降。为了解决这一问题,提出了一种基于深度强化学习(DRL)架构的自主机动决策方法。同时,将一个Transformer网络集成到演员和评论家网络中,该网络可以发现时间序列轨迹数据之间潜在的依赖关系。通过使用这些关系,可以部分补偿信息损失,从而使机动决策更加准确。将Transformer网络引入DRL后,会出现经验样本有限、采样效率低、智能体训练状态稳定性差的问题。针对这些问题,提出了设计有效的决策奖励、优先抽样方法和动态学习率调整机制等措施。大量仿真结果表明,该方法优于传统的DRL算法,在对手信息丢失的情况下具有更高的胜率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
An Intelligent Maneuver Decision-Making Approach for Air Combat Based on Deep Reinforcement Learning and Transformer Networks.

The traditional maneuver decision-making approaches are highly dependent on accurate and complete situation information, and their decision-making quality becomes poor when opponent information is occasionally missing in complex electromagnetic environments. In order to solve this problem, an autonomous maneuver decision-making approach is developed based on deep reinforcement learning (DRL) architecture. Meanwhile, a Transformer network is integrated into the actor and critic networks, which can find the potential dependency relationships among the time series trajectory data. By using these relationships, the information loss is partially compensated, which leads to maneuvering decisions being more accurate. The issues of limited experience samples, low sampling efficiency, and poor stability in the agent training state appear when the Transformer network is introduced into DRL. To address these issues, the measures of designing an effective decision-making reward, a prioritized sampling method, and a dynamic learning rate adjustment mechanism are proposed. Numerous simulation results show that the proposed approach outperforms the traditional DRL algorithms, with a higher win rate in the case of opponent information loss.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Entropy
Entropy PHYSICS, MULTIDISCIPLINARY-
CiteScore
4.90
自引率
11.10%
发文量
1580
审稿时长
21.05 days
期刊介绍: Entropy (ISSN 1099-4300), an international and interdisciplinary journal of entropy and information studies, publishes reviews, regular research papers and short notes. Our aim is to encourage scientists to publish as much as possible their theoretical and experimental details. There is no restriction on the length of the papers. If there are computation and the experiment, the details must be provided so that the results can be reproduced.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信