Comparing attention-based methods with long short-term memory for state encoding in reinforcement learning-based separation management

Authors: D.J. Groot, J. Ellerbroek, J.M. Hoekstra
DOI: 10.1016/j.engappai.2025.111592
Journal: Engineering Applications of Artificial Intelligence, Volume 159, Article 111592
Publication date: 2025-07-05
Source: https://www.sciencedirect.com/science/article/pii/S0952197625015945
Reinforcement learning (RL) has been studied extensively for conflict resolution and separation management in air traffic control, offering advantages over analytical methods. One key challenge in applying RL to this task is the construction of the input vector: because the number of agents in the airspace varies, methods that can handle a dynamic number of agents are required. Various approaches exist, for example selecting a fixed number of aircraft, or encoding the information with recurrent neural networks or attention. Multiple studies have shown promising results with these encoder methods; however, studies comparing them are limited, and the results remain inconclusive as to which method works best. To address this issue, this paper compares different input encoding methods: three attention mechanisms (scaled dot-product, additive, and context-aware attention) and long short-term memory (LSTM) with three different sorting strategies. These methods are used as input encoders for models trained with the Soft Actor-Critic algorithm for separation management in high-traffic-density scenarios. Additive attention is found to be the most effective at increasing total safety and maximizing path efficiency, outperforming the commonly used scaled dot-product attention and the LSTM. Additionally, the order of the input sequence is shown to significantly affect the performance of the LSTM-based input encoder. This contrasts with the attention methods, which are sequence-independent and therefore do not suffer from biases introduced by the order of the input sequence.
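The sequence-independence property mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it is a generic NumPy rendering of the two attention scorings named above (scaled dot-product and additive, Bahdanau-style), where the query `q` stands for an ownship state, the keys `K` for a variable number of intruder states, and the parameters `W1`, `W2`, and `v` are arbitrary illustrative weights. Permuting the intruder order leaves the pooled output unchanged, which is exactly why attention encoders avoid the ordering bias that affects the LSTM encoder.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def scaled_dot_product_pool(q, K, V):
    # Scaled dot-product attention: score_i = (k_i . q) / sqrt(d).
    scores = K @ q / np.sqrt(len(q))
    return softmax(scores) @ V

def additive_pool(q, K, V, W1, W2, v):
    # Additive (Bahdanau-style) attention: score_i = v^T tanh(W1 q + W2 k_i).
    scores = np.tanh(q @ W1 + K @ W2) @ v
    return softmax(scores) @ V

d, n = 4, 6                        # feature size, number of intruder aircraft
q = rng.normal(size=d)             # ownship state (query)
K = rng.normal(size=(n, d))        # intruder states (keys)
V = K.copy()                       # values: here, the same intruder states
W1 = rng.normal(size=(d, d))       # illustrative additive-attention weights
W2 = rng.normal(size=(d, d))
v = rng.normal(size=d)

# Pooled outputs are invariant to the order of the intruder sequence.
perm = rng.permutation(n)
out_sdp = scaled_dot_product_pool(q, K, V)
out_add = additive_pool(q, K, V, W1, W2, v)
assert np.allclose(out_sdp, scaled_dot_product_pool(q, K[perm], V[perm]))
assert np.allclose(out_add, additive_pool(q, K[perm], V[perm], W1, W2, v))
```

Because the softmax weights and the values are permuted identically, the weighted sum is unchanged; an LSTM fed the same intruders in a different order would in general produce a different encoding, which is why the paper's sorting strategies matter for the LSTM but not for attention.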
Journal introduction:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.