{"title":"采用分段奖励分配的预测性空战决策模型","authors":"Yundi Li, Yinlong Yuan, Yun Cheng, Liang Hua","doi":"10.1007/s40747-024-01556-3","DOIUrl":null,"url":null,"abstract":"<p>In air combat missions, unmanned combat aerial vehicles (UCAVs) must take strategic actions to establish combat advantages, enabling effective tracking and attacking of enemy UCAVs. Currently, a lot of reinforcement learning algorithms are applied to the air combat mission of unmanned fighter aircraft. However, most algorithms can only select policies based on the current state of both sides. This leads to the inability to effectively track and attack when the enemy performs large angle maneuvering. Additionally, these algorithms cannot adapt to different situations, resulting in the unmanned fighter aircraft being at a disadvantage in some cases. To solve these problems, this paper proposes predictive air combat decision model with segmented reward allocation for air combat tracking and attacking. On the basis of the air combat environment, we propose the prediction soft actor-critic (Pre-SAC) algorithm, which combines the prediction of enemy states with the states of UCAV for model training. This enables the UCAV to predict the next move of the enemy UCAV in advance and establish a greater air combat advantage for us. Furthermore, by adopting a segmented reward allocation model and combining it with the Pre-SAC algorithm, we propose the segmented reward allocation soft actor-critic (Sra-SAC) algorithm, which solves the problem of UCAVs being unable to adapt to different situations. The results show that the prediction-based segmented reward allocation the Sra-SAC algorithm outperforms the traditional soft actor-critic (SAC) algorithm in terms of overall reward, travel distance, and relative advantage.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":null,"pages":null},"PeriodicalIF":5.0000,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Predictive air combat decision model with segmented reward allocation\",\"authors\":\"Yundi Li, Yinlong Yuan, Yun Cheng, Liang Hua\",\"doi\":\"10.1007/s40747-024-01556-3\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>In air combat missions, unmanned combat aerial vehicles (UCAVs) must take strategic actions to establish combat advantages, enabling effective tracking and attacking of enemy UCAVs. Currently, a lot of reinforcement learning algorithms are applied to the air combat mission of unmanned fighter aircraft. However, most algorithms can only select policies based on the current state of both sides. This leads to the inability to effectively track and attack when the enemy performs large angle maneuvering. Additionally, these algorithms cannot adapt to different situations, resulting in the unmanned fighter aircraft being at a disadvantage in some cases. To solve these problems, this paper proposes predictive air combat decision model with segmented reward allocation for air combat tracking and attacking. On the basis of the air combat environment, we propose the prediction soft actor-critic (Pre-SAC) algorithm, which combines the prediction of enemy states with the states of UCAV for model training. This enables the UCAV to predict the next move of the enemy UCAV in advance and establish a greater air combat advantage for us. Furthermore, by adopting a segmented reward allocation model and combining it with the Pre-SAC algorithm, we propose the segmented reward allocation soft actor-critic (Sra-SAC) algorithm, which solves the problem of UCAVs being unable to adapt to different situations. The results show that the prediction-based segmented reward allocation the Sra-SAC algorithm outperforms the traditional soft actor-critic (SAC) algorithm in terms of overall reward, travel distance, and relative advantage.</p>\",\"PeriodicalId\":10524,\"journal\":{\"name\":\"Complex & Intelligent Systems\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":5.0000,\"publicationDate\":\"2024-07-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Complex & Intelligent Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s40747-024-01556-3\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Complex & Intelligent Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s40747-024-01556-3","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Predictive air combat decision model with segmented reward allocation
In air combat missions, unmanned combat aerial vehicles (UCAVs) must take strategic actions to establish combat advantages, enabling effective tracking and attacking of enemy UCAVs. Currently, a lot of reinforcement learning algorithms are applied to the air combat mission of unmanned fighter aircraft. However, most algorithms can only select policies based on the current state of both sides. This leads to the inability to effectively track and attack when the enemy performs large angle maneuvering. Additionally, these algorithms cannot adapt to different situations, resulting in the unmanned fighter aircraft being at a disadvantage in some cases. To solve these problems, this paper proposes predictive air combat decision model with segmented reward allocation for air combat tracking and attacking. On the basis of the air combat environment, we propose the prediction soft actor-critic (Pre-SAC) algorithm, which combines the prediction of enemy states with the states of UCAV for model training. This enables the UCAV to predict the next move of the enemy UCAV in advance and establish a greater air combat advantage for us. Furthermore, by adopting a segmented reward allocation model and combining it with the Pre-SAC algorithm, we propose the segmented reward allocation soft actor-critic (Sra-SAC) algorithm, which solves the problem of UCAVs being unable to adapt to different situations. The results show that the prediction-based segmented reward allocation the Sra-SAC algorithm outperforms the traditional soft actor-critic (SAC) algorithm in terms of overall reward, travel distance, and relative advantage.
期刊介绍:
Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.