Offline reinforcement learning strategies guided meta-heuristics for scheduling bi-objective unmanned surface vessel problems with multiple constraints
Authors: Wuze Huang, Kaizhou Gao, Naiqi Wu, Liang Zhao, Renato Tinós
Journal: Swarm and Evolutionary Computation, vol. 99, Article 102159 (Impact Factor 8.5; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2025-09-27
DOI: 10.1016/j.swevo.2025.102159
URL: https://www.sciencedirect.com/science/article/pii/S2210650225003165
This study proposes a reinforcement learning-guided meta-heuristic framework for bi-objective unmanned surface vessel (USV) scheduling problems under complex marine constraints, aiming to minimize the maximum completion time and the total collision risk index simultaneously. First, a bi-objective mathematical model is developed that considers three constraints: battery capacity, marine obstacles, and uncertain task execution times. Second, four meta-heuristics are adapted and improved to solve the problem, and seven local search operators are designed around the problem’s features to enhance their performance. Third, two state-reward strategies are designed and each is integrated into Q-learning and SARSA, yielding four reinforcement learning (RL) algorithms. These four RL algorithms are trained offline and then used to select the best local search operator during the iterations of the meta-heuristics, improving search efficiency. Finally, the proposed algorithms are evaluated on 10 cases of different scales. The experimental results and statistical tests verify the effectiveness of the local search operators and show that the four RL algorithms further improve the algorithms’ performance. Particle swarm optimization (PSO) combined with Q-learning under the second state-reward strategy (PSO_QL2) is the most competitive of all compared algorithms.
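The operator-selection idea in the abstract can be illustrated with a minimal tabular sketch. It is not the authors' implementation: the abstract does not define the two state-reward strategies, so the integer state index and the scalar reward are placeholders, and the class name `OperatorSelector`, its methods, and the hyperparameter values are hypothetical. The sketch only shows how the Q-learning and SARSA updates differ when an agent chooses among seven local search operators.

```python
import random


class OperatorSelector:
    """Tabular epsilon-greedy agent that picks a local search operator.

    Assumption: states are small integers (e.g. a coarse bucket of search
    progress) and the reward is a scalar supplied by the caller. The paper's
    actual state and reward definitions are not given in the abstract.
    """

    def __init__(self, n_states, n_operators, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
        # One Q-value per (state, operator) pair, initialised to zero.
        self.q = [[0.0] * n_operators for _ in range(n_states)]
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)

    def greedy(self, state):
        """Index of the operator with the highest Q-value in this state."""
        row = self.q[state]
        return max(range(len(row)), key=row.__getitem__)

    def select(self, state, explore=True):
        """Epsilon-greedy choice; with explore=False it is purely greedy."""
        if explore and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q[state]))
        return self.greedy(state)

    def update_q_learning(self, s, a, r, s_next):
        """Off-policy update: bootstrap from the best next-state value."""
        target = r + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])

    def update_sarsa(self, s, a, r, s_next, a_next):
        """On-policy update: bootstrap from the operator actually chosen next."""
        target = r + self.gamma * self.q[s_next][a_next]
        self.q[s][a] += self.alpha * (target - self.q[s][a])


if __name__ == "__main__":
    agent = OperatorSelector(n_states=2, n_operators=7, alpha=0.5)
    # Pretend operator 3 improved the schedule in state 0 (reward 1.0).
    agent.update_q_learning(s=0, a=3, r=1.0, s_next=1)
    print(agent.select(0, explore=False))
```

In an offline setting such as the one described, the table would be filled during training runs and then queried with `select(state, explore=False)` inside the meta-heuristic loop; how the authors actually train and deploy their agents is not specified in the abstract.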
About the journal:
Swarm and Evolutionary Computation is a peer-reviewed journal focused on the latest research and advancements in nature-inspired intelligent computation using swarm and evolutionary algorithms. It covers theoretical, experimental, and practical aspects of these paradigms and their hybrids, promoting interdisciplinary research. The journal prioritizes the publication of high-quality, original articles that push the boundaries of evolutionary computation and swarm intelligence. Additionally, it welcomes survey papers on current topics and novel applications. Topics of interest include but are not limited to: Genetic Algorithms and Genetic Programming, Evolution Strategies and Evolutionary Programming, Differential Evolution, Artificial Immune Systems, Particle Swarms, Ant Colony, Bacterial Foraging, Artificial Bees, Firefly Algorithm, Harmony Search, Artificial Life, Digital Organisms, Estimation of Distribution Algorithms, Stochastic Diffusion Search, Quantum Computing, Nano Computing, Membrane Computing, Human-centric Computing, Hybridization of Algorithms, Memetic Computing, Autonomic Computing, Self-organizing Systems, and Combinatorial, Discrete, Binary, Constrained, Multi-objective, Multi-modal, Dynamic, and Large-scale Optimization.