Tactical Intent-Driven Autonomous Air Combat Behavior Generation Method

Impact Factor 5.0 · CAS Region 2 (Computer Science) · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Xingyu Wang, Zhen Yang, Shiyuan Chai, Jichuan Huang, Yupeng He, Deyun Zhou
Journal: Complex & Intelligent Systems
DOI: 10.1007/s40747-024-01685-9
Published: 2024-12-05 (Journal Article)
Citations: 0

Abstract

With the rapid development and deep application of artificial intelligence, modern air combat is incrementally evolving toward intelligent combat. Although deep reinforcement learning algorithms have driven dramatic advances in air combat, they still face challenges such as poor interpretability and weak transferability of adversarial strategies. To address this, this paper proposes a tactical intent-driven method for autonomous air combat behavior generation. First, the paper explores the mapping relationship between optimal strategies and rewards, demonstrating the detrimental effect that combining sparse and dense rewards has on the learned policy. Building on this, the decision-making process behind pilot behavior is analyzed, and a reward mapping model from intent to behavior is established. Finally, to address the poor stability and slow convergence of deep reinforcement learning algorithms in large-scale state-action spaces, the dueling-noisy-multi-step DQN algorithm is devised, which not only improves the accuracy of value-function approximation but also enhances the efficiency of state-space exploration and network generalization. Experiments demonstrate the conflicts between sparse and dense rewards, and the empirical results capture the superior performance and stability of the proposed algorithm compared with other algorithms. More intuitively, the strategies under different intents exhibit strong interpretability and flexibility, which can provide tactical support for intelligent decision-making in air combat.
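The abstract names three DQN extensions (a dueling value head, noisy exploration layers, and multi-step returns) without giving implementation details. As a minimal sketch only — the function names, shapes, and hyperparameters below are illustrative assumptions, not taken from the paper — the core computation of each component can be written as:

```python
import numpy as np

def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')."""
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean()

def noisy_weights(mu, sigma, rng):
    """NoisyNet-style exploration: sample perturbed weights w = mu + sigma * eps,
    where eps is standard Gaussian noise (sigma is a learned parameter in practice)."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return mu + sigma * rng.standard_normal(mu.shape)

def n_step_target(rewards, bootstrap_q, gamma=0.99):
    """Multi-step TD target: sum_{k=0}^{n-1} gamma^k r_{t+k} + gamma^n max_a Q(s_{t+n}, a)."""
    n = len(rewards)
    discounted = sum((gamma ** k) * r for k, r in enumerate(rewards))
    return discounted + (gamma ** n) * bootstrap_q
```

For example, `dueling_q(1.0, [1.0, 2.0, 3.0])` yields `[0.0, 1.0, 2.0]`: subtracting the mean advantage makes the V/A decomposition identifiable. The multi-step target propagates rewards over n transitions before bootstrapping, which is one common way to speed convergence in sparse-reward settings.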

Source journal: Complex & Intelligent Systems (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
CiteScore: 9.60
Self-citation rate: 10.30%
Articles published per year: 297
Journal description: Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.