A reinforcement learning-based evolutionary algorithm for the unmanned aerial vehicles maritime search and rescue path planning problem considering multiple rescue centers
IF 3.3 | Zone 2, Computer Science | JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
{"title":"A reinforcement learning-based evolutionary algorithm for the unmanned aerial vehicles maritime search and rescue path planning problem considering multiple rescue centers","authors":"Haowen Zhan, Yue Zhang, Jingbo Huang, Yanjie Song, Lining Xing, Jie Wu, Zengyun Gao","doi":"10.1007/s12293-024-00420-8","DOIUrl":null,"url":null,"abstract":"<p>In the realm of maritime emergencies, unmanned aerial vehicles (UAVs) play a crucial role in enhancing search and rescue (SAR) operations. They help in efficiently rescuing distressed crews, strengthening maritime surveillance, and maintaining national security due to their cost-effectiveness, versatility, and effectiveness. However, the vast expanse of sea territories and the rapid changes in maritime conditions make a single SAR center insufficient for handling complex emergencies. Thus, it is vital to develop strategies for quickly deploying UAV resources from multiple SAR centers for area reconnaissance and supporting maritime rescue operations. This study introduces a graph-structured planning model for the maritime SAR path planning problem, considering multiple rescue centers (MSARPPP-MRC). It incorporates workload distribution among SAR centers and UAV operational constraints. We propose a reinforcement learning-based genetic algorithm (GA-RL) to tackle the MSARPPP-MRC problem. GA-RL uses heuristic rules to initialize the population and employs the Q-learning method to manage the progeny during each generation, including their retention, storage, or disposal. When the elite repository’s capacity is reached, a decision is made on the utilization of these members to refresh the population. Additionally, adaptive crossover and perturbation strategies are applied to develop a more effective SAR scheme. Extensive testing proves that GA-RL surpasses other algorithms in optimization efficacy and efficiency, highlighting the benefits of reinforcement learning in population management.</p>","PeriodicalId":48780,"journal":{"name":"Memetic Computing","volume":null,"pages":null},"PeriodicalIF":3.3000,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Memetic Computing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s12293-024-00420-8","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
In the realm of maritime emergencies, unmanned aerial vehicles (UAVs) play a crucial role in enhancing search and rescue (SAR) operations. Thanks to their cost-effectiveness and versatility, they help rescue distressed crews efficiently, strengthen maritime surveillance, and support national security. However, the vast expanse of sea territories and rapidly changing maritime conditions make a single SAR center insufficient for handling complex emergencies, so strategies are needed for quickly deploying UAV resources from multiple SAR centers for area reconnaissance and support of maritime rescue operations. This study introduces a graph-structured planning model for the maritime SAR path planning problem with multiple rescue centers (MSARPPP-MRC), incorporating the workload distribution among SAR centers and UAV operational constraints. We propose a reinforcement learning-based genetic algorithm (GA-RL) to tackle the MSARPPP-MRC problem. GA-RL uses heuristic rules to initialize the population and employs Q-learning to manage the offspring of each generation, deciding whether each individual is retained in the population, stored in an elite repository, or discarded. When the elite repository reaches capacity, a decision is made on how its members are used to refresh the population. Additionally, adaptive crossover and perturbation strategies are applied to produce a more effective SAR scheme. Extensive experiments show that GA-RL surpasses comparison algorithms in optimization efficacy and efficiency, highlighting the benefits of reinforcement learning in population management.
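The abstract describes GA-RL only at a high level, so the following is a minimal, hedged sketch of the general idea: a steady-state genetic algorithm in which a tabular Q-learning agent decides whether each offspring is retained in the population, stored in an elite repository, or discarded, and the repository is flushed into the population once it fills up. The toy fitness function, state encoding, reward signal, and all hyperparameters below are placeholder assumptions, not details taken from the paper.

```python
# Illustrative sketch only (not the authors' implementation): a GA whose offspring
# handling -- retain / store in an elite repository / discard -- is chosen by a
# tabular Q-learning agent, loosely following the GA-RL idea from the abstract.
import random

RETAIN, STORE, DISCARD = 0, 1, 2          # actions available to the Q-learning agent
ACTIONS = (RETAIN, STORE, DISCARD)

def fitness(ind):
    # Toy objective standing in for the (unspecified) SAR route cost; smaller is better.
    return sum((x - 0.5) ** 2 for x in ind)

def make_individual(n=10):
    return [random.random() for _ in range(n)]

def crossover(a, b):
    point = random.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(ind, rate=0.1):
    return [x + random.gauss(0, 0.1) if random.random() < rate else x for x in ind]

def state_of(child_fit, pop_fits):
    # Coarse state: is the child better than the population median or not?
    median = sorted(pop_fits)[len(pop_fits) // 2]
    return 0 if child_fit < median else 1

def choose(Q, s, eps=0.2):
    # Epsilon-greedy action selection over the tabular Q-values.
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

def ga_rl(pop_size=30, generations=200, repo_cap=10, alpha=0.3, gamma=0.9):
    pop = [make_individual() for _ in range(pop_size)]
    repo = []                                         # elite repository
    Q = {(s, a): 0.0 for s in (0, 1) for a in ACTIONS}
    for _ in range(generations):
        fits = [fitness(p) for p in pop]
        best_before = min(fits)
        child = mutate(crossover(*random.sample(pop, 2)))
        s = state_of(fitness(child), fits)
        a = choose(Q, s)
        if a == RETAIN:                               # replace the current worst individual
            worst = max(range(pop_size), key=lambda i: fits[i])
            pop[worst] = child
        elif a == STORE:
            repo.append(child)
        # DISCARD: the child is simply dropped.
        if len(repo) >= repo_cap:                     # repository full: refresh the population
            repo.sort(key=fitness)
            for i, elite in enumerate(repo[: pop_size // 3]):
                pop[i] = elite
            repo = []
        new_fits = [fitness(p) for p in pop]
        reward = best_before - min(new_fits)          # positive when the best fitness improved
        s_next = state_of(fitness(child), new_fits)
        Q[(s, a)] += alpha * (reward + gamma * max(Q[(s_next, b)] for b in ACTIONS) - Q[(s, a)])
    return min(pop, key=fitness)

if __name__ == "__main__":
    best = ga_rl()
    print("best fitness:", fitness(best))
```

In the paper itself, the heuristic population initialization, the adaptive crossover and perturbation operators, and the state/reward design are tailored to the graph-structured MSARPPP-MRC model; the sketch above only mirrors the Q-learning-driven population-management loop.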
Memetic Computing (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; OPERATIONS RESEARCH & MANAGEMENT SCIENCE)
CiteScore: 6.80
Self-citation rate: 12.80%
Articles published: 31
About the journal:
Memes have been defined as basic units of transferrable information that reside in the brain and are propagated across populations through the process of imitation. From an algorithmic point of view, memes have come to be regarded as building-blocks of prior knowledge, expressed in arbitrary computational representations (e.g., local search heuristics, fuzzy rules, neural models, etc.), that have been acquired through experience by a human or machine, and can be imitated (i.e., reused) across problems.
The Memetic Computing journal welcomes papers incorporating the aforementioned socio-cultural notion of memes into artificial systems, with particular emphasis on enhancing the efficacy of computational and artificial intelligence techniques for search, optimization, and machine learning through explicit incorporation of prior knowledge. The journal thus aims to be an outlet for high-quality theoretical and applied research on hybrid, knowledge-driven computational approaches that may be characterized under any of the following categories of memetics:
Type 1: General-purpose algorithms integrated with human-crafted heuristics that capture some form of prior domain knowledge; e.g., traditional memetic algorithms hybridizing evolutionary global search with a problem-specific local search (a minimal sketch of this hybrid appears after this list).
Type 2: Algorithms with the ability to automatically select, adapt, and reuse the most appropriate heuristics from a diverse pool of available choices; e.g., learning a mapping between global search operators and multiple local search schemes, given an optimization problem at hand.
Type 3: Algorithms that autonomously learn with experience, adaptively reusing data and/or machine learning models drawn from related problems as prior knowledge in new target tasks of interest; examples include, but are not limited to, transfer learning and optimization, multi-task learning and optimization, or any other multi-X evolutionary learning and optimization methodologies.
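To make the Type 1 category concrete, here is a minimal illustrative sketch of a memetic algorithm that hybridizes an evolutionary global search with a greedy, problem-specific local search. The toy objective, operators, and parameters are placeholders chosen only for illustration; they are not drawn from any specific paper.

```python
# Minimal "Type 1" memetic algorithm sketch: evolutionary global search whose
# offspring are refined by a hand-crafted local search before entering the population.
import random

def cost(x):
    # Toy objective (sphere function); lower is better.
    return sum(v * v for v in x)

def local_search(x, step=0.05, iters=20):
    # Greedy hill climbing standing in for problem-specific prior knowledge.
    best = list(x)
    for _ in range(iters):
        cand = [v + random.uniform(-step, step) for v in best]
        if cost(cand) < cost(best):
            best = cand
    return best

def memetic_search(dim=5, pop_size=20, generations=100):
    pop = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        a, b = random.sample(pop, 2)
        cut = random.randrange(1, dim)
        child = a[:cut] + b[cut:]                          # crossover (global search)
        child = [v + random.gauss(0, 0.1) for v in child]  # mutation
        child = local_search(child)                        # memetic refinement
        worst = max(range(pop_size), key=lambda i: cost(pop[i]))
        if cost(child) < cost(pop[worst]):
            pop[worst] = child                             # steady-state replacement
    return min(pop, key=cost)

if __name__ == "__main__":
    print("best cost:", cost(memetic_search()))
```

Types 2 and 3 would, roughly speaking, replace the fixed local_search above with a learned choice among several refinement heuristics, or with data and models transferred from related tasks.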