Learning human-like decisions for flight approach: Environment and an imitation learning method

IF 7.6 · Zone 1, Engineering & Technology · JCR Q1, Transportation Science & Technology

Transportation Research Part C-Emerging Technologies · Pub Date: 2025-05-03 · DOI: 10.1016/j.trc.2025.105142

Haifeng Liu, Dongyue Guo, Shizhong Zhou, Zheng Zhang, Hongyu Yang, Yi Lin

Volume 176, Article 105142 · Journal Article · https://www.sciencedirect.com/science/article/pii/S0968090X25001469

Citations: 0

Abstract


Learning human-like decisions for flight approach: Environment and an imitation learning method

Flight approach in terminal airspace is a challenging air traffic control task because of high aircraft density and frequent maneuvering. Existing reinforcement learning methods have been studied only in simulation environments and suffer from sparse rewards and state-space explosion. This work proposes AppGAIL, an imitation learning-based autonomous framework that makes flight approach decisions from human expert demonstrations and thereby removes the need to design handcrafted rewards. To cope with state-space explosion, a cylindrical grid airspace model converts the earth space into a discrete airspace, so that the real-time traffic situation is represented by cells of nearly identical distance. A generative adversarial mechanism achieves imitation learning by distinguishing the source of the input observations (sliding-window sequences of states and actions), i.e., whether they come from the generator or from expert demonstrations. Because all human expert demonstrations are safe operations, the model has limited opportunity to learn about flight conflicts and the generator can be misled into planning conflicting trajectories; a conflict-aware discriminator is therefore proposed that detects possible conflicts through a multi-task framework with learnable weights, which further supports the adversarial training. The method is validated on a real-world traffic dataset, with several custom metrics proposed to support real-world air traffic control. Experimental results show that AppGAIL outperforms the baseline methods, achieving a potential conflict rate of only 0.67% and a dynamic time warping distance of 3.732 kilometers. All proposed technical modules contribute to the performance improvement. In addition, multi-aircraft planning and real-time factors can also be handled, which improves the applicability of the proposed method.
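The abstract reports trajectory similarity as a dynamic time warping (DTW) distance in kilometers. As a rough illustration of what such a metric measures, the sketch below implements the textbook DTW recurrence between a generated and an expert trajectory. It is not taken from the paper: the function name `dtw_distance`, the use of Euclidean distance between points, the assumption that points are already expressed in kilometers in a local planar frame, and the absence of any normalization by path length are all assumptions made here for illustration.

```python
import math


def dtw_distance(generated, expert):
    """Textbook dynamic time warping between two point sequences.

    Hypothetical helper, not the paper's implementation. Each sequence is a
    list of equal-length coordinate tuples (assumed to be in kilometers).
    Returns the accumulated Euclidean cost along the cheapest monotonic
    alignment of the two sequences.
    """
    n, m = len(generated), len(expert)
    if n == 0 or m == 0:
        raise ValueError("both trajectories must contain at least one point")

    # cost[i][j] = cheapest cost of aligning the first i generated points
    # with the first j expert points.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(generated[i - 1], expert[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # generated point repeats an expert match
                cost[i][j - 1],      # expert point repeats a generated match
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


if __name__ == "__main__":
    expert_track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    # Same path, but the first point is held for one extra time step.
    delayed_track = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    print(dtw_distance(delayed_track, expert_track))
```

Because DTW lets one sequence pause while the other advances, a trajectory that follows the same path with a timing offset is not penalized, which is one reason the metric is commonly used to compare flown paths rather than point-by-point error. Whether the reported 3.732 km is a total or a per-step average is not stated in the abstract.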


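Source journal

Transportation Research Part C-Emerging Technologies (Engineering & Technology: Transportation Science & Technology)

CiteScore: 15.80

Self-citation rate: 12.00%

Articles published: 332

Review time: 64 days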

About the journal: Transportation Research: Part C (TR_C) is dedicated to showcasing high-quality, scholarly research that delves into the development, applications, and implications of transportation systems and emerging technologies. Our focus lies not solely on individual technologies, but rather on their broader implications for the planning, design, operation, control, maintenance, and rehabilitation of transportation systems, services, and components. In essence, the intellectual core of the journal revolves around the transportation aspect rather than the technology itself. We actively encourage the integration of quantitative methods from diverse fields such as operations research, control systems, complex networks, computer science, and artificial intelligence. Join us in exploring the intersection of transportation systems and emerging technologies to drive innovation and progress in the field.