Enhancing leadership-based metaheuristics using reinforcement learning: A case study in grey wolf optimizer

IF 7.6 | CAS Q1 (Computer Science) | JCR Q1 (Computer Science, Artificial Intelligence)
Afifeh Maleki, Mehdy Roayaei, Seyedali Mirjalili
{"title":"使用强化学习增强基于领导的元启发式:灰狼优化器的案例研究","authors":"Afifeh Maleki ,&nbsp;Mehdy Roayaei ,&nbsp;Seyedali Mirjalili","doi":"10.1016/j.knosys.2025.114471","DOIUrl":null,"url":null,"abstract":"<div><div>Metaheuristics are widely applied in optimization because of their flexibility and ability to address complex and high-dimensional problems. Nevertheless, they face persistent challenges, including susceptibility to local optima, limited parameter adaptability, and premature convergence. Leadership-based metaheuristics, in which leaders guide the search process, encounter additional difficulties such as limited exploration capacity, leader stagnation, and reduced diversity, often stemming from underutilization of data generated during the search. To overcome these limitations, this study proposes a reinforcement learning–based approach, RL-LGWO, which enhances the Grey Wolf Optimizer (GWO) by integrating multi-agent reinforcement learning. In RL-LGWO, agents share experiences to improve decision-making, and reinforcement learning is employed to decouple and adapt the leader update mechanism, thereby improving the exploration–exploitation balance and enabling leaders to dynamically escape local optima. The proposed method was evaluated against two GWO-enhancing algorithms, three RL-based GWO variants, PSO, WOA, and the original GWO across 23 well-known benchmark functions, in addition to the recent CEC2022 benchmark suite. Experimental results show that RL-LGWO achieved the best solutions on 17 of the 23 benchmark functions, with superior convergence speed and improved stability, while incurring only a minor runtime increase compared with the original GWO. Furthermore, on the CEC2022 suite, RL-LGWO outperformed competing algorithms on 10 of 12 test functions, underscoring its robustness and adaptability to recent and challenging benchmarks. Overall, the findings indicate that RL-LGWO delivers a substantive improvement over state-of-the-art alternatives and holds strong potential to advance leadership-based metaheuristics for a wide range of optimization problems.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"330 ","pages":"Article 114471"},"PeriodicalIF":7.6000,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhancing leadership-based metaheuristics using reinforcement learning: A case study in grey wolf optimizer\",\"authors\":\"Afifeh Maleki ,&nbsp;Mehdy Roayaei ,&nbsp;Seyedali Mirjalili\",\"doi\":\"10.1016/j.knosys.2025.114471\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Metaheuristics are widely applied in optimization because of their flexibility and ability to address complex and high-dimensional problems. Nevertheless, they face persistent challenges, including susceptibility to local optima, limited parameter adaptability, and premature convergence. Leadership-based metaheuristics, in which leaders guide the search process, encounter additional difficulties such as limited exploration capacity, leader stagnation, and reduced diversity, often stemming from underutilization of data generated during the search. To overcome these limitations, this study proposes a reinforcement learning–based approach, RL-LGWO, which enhances the Grey Wolf Optimizer (GWO) by integrating multi-agent reinforcement learning. 
In RL-LGWO, agents share experiences to improve decision-making, and reinforcement learning is employed to decouple and adapt the leader update mechanism, thereby improving the exploration–exploitation balance and enabling leaders to dynamically escape local optima. The proposed method was evaluated against two GWO-enhancing algorithms, three RL-based GWO variants, PSO, WOA, and the original GWO across 23 well-known benchmark functions, in addition to the recent CEC2022 benchmark suite. Experimental results show that RL-LGWO achieved the best solutions on 17 of the 23 benchmark functions, with superior convergence speed and improved stability, while incurring only a minor runtime increase compared with the original GWO. Furthermore, on the CEC2022 suite, RL-LGWO outperformed competing algorithms on 10 of 12 test functions, underscoring its robustness and adaptability to recent and challenging benchmarks. Overall, the findings indicate that RL-LGWO delivers a substantive improvement over state-of-the-art alternatives and holds strong potential to advance leadership-based metaheuristics for a wide range of optimization problems.</div></div>\",\"PeriodicalId\":49939,\"journal\":{\"name\":\"Knowledge-Based Systems\",\"volume\":\"330 \",\"pages\":\"Article 114471\"},\"PeriodicalIF\":7.6000,\"publicationDate\":\"2025-09-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Knowledge-Based Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0950705125015102\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Knowledge-Based Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0950705125015102","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Metaheuristics are widely applied in optimization because of their flexibility and ability to address complex and high-dimensional problems. Nevertheless, they face persistent challenges, including susceptibility to local optima, limited parameter adaptability, and premature convergence. Leadership-based metaheuristics, in which leaders guide the search process, encounter additional difficulties such as limited exploration capacity, leader stagnation, and reduced diversity, often stemming from underutilization of data generated during the search. To overcome these limitations, this study proposes a reinforcement learning–based approach, RL-LGWO, which enhances the Grey Wolf Optimizer (GWO) by integrating multi-agent reinforcement learning. In RL-LGWO, agents share experiences to improve decision-making, and reinforcement learning is employed to decouple and adapt the leader update mechanism, thereby improving the exploration–exploitation balance and enabling leaders to dynamically escape local optima. The proposed method was evaluated against two GWO-enhancing algorithms, three RL-based GWO variants, PSO, WOA, and the original GWO across 23 well-known benchmark functions, in addition to the recent CEC2022 benchmark suite. Experimental results show that RL-LGWO achieved the best solutions on 17 of the 23 benchmark functions, with superior convergence speed and improved stability, while incurring only a minor runtime increase compared with the original GWO. Furthermore, on the CEC2022 suite, RL-LGWO outperformed competing algorithms on 10 of 12 test functions, underscoring its robustness and adaptability to recent and challenging benchmarks. Overall, the findings indicate that RL-LGWO delivers a substantive improvement over state-of-the-art alternatives and holds strong potential to advance leadership-based metaheuristics for a wide range of optimization problems.
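The abstract does not detail RL-LGWO's internal leader-update mechanism, but the canonical GWO position-update equations it builds on are standard. The sketch below pairs that standard update with a minimal, hypothetical tabular Q-learning controller that rescales the control parameter a according to whether the best fitness improved; the controller (QController, rl_gwo, the state discretization, and the reward definition) is an illustrative assumption, not the authors' RL-LGWO design, which instead adapts the leader-update mechanism itself via multi-agent reinforcement learning.

```python
import numpy as np

# --- Standard GWO position update (canonical equations) ---
def gwo_step(positions, leaders, a, bounds):
    """Move each wolf toward the three leaders (alpha, beta, delta).
    The parameter a controls the exploration/exploitation trade-off."""
    lo, hi = bounds
    new_positions = np.empty_like(positions)
    for i, X in enumerate(positions):
        candidate = np.zeros_like(X)
        for leader in leaders:                           # alpha, beta, delta
            r1, r2 = np.random.rand(*X.shape), np.random.rand(*X.shape)
            A = 2.0 * a * r1 - a                         # step coefficient
            C = 2.0 * r2                                 # leader-emphasis coefficient
            D = np.abs(C * leader - X)                   # distance to this leader
            candidate += leader - A * D
        new_positions[i] = np.clip(candidate / 3.0, lo, hi)
    return new_positions

# --- Hypothetical RL controller (illustrative only, not the paper's method) ---
class QController:
    def __init__(self, n_states=5, actions=(0.5, 1.0, 2.0), lr=0.1, gamma=0.9, eps=0.2):
        self.actions = actions                           # multipliers on the default a schedule
        self.q = np.zeros((n_states, len(actions)))
        self.lr, self.gamma, self.eps = lr, gamma, eps
        self.n_states = n_states

    def state(self, progress):                           # discretize search progress
        return min(int(progress * self.n_states), self.n_states - 1)

    def act(self, s):                                    # epsilon-greedy action selection
        if np.random.rand() < self.eps:
            return np.random.randint(len(self.actions))
        return int(np.argmax(self.q[s]))

    def update(self, s, a_idx, reward, s_next):          # standard Q-learning update
        target = reward + self.gamma * self.q[s_next].max()
        self.q[s, a_idx] += self.lr * (target - self.q[s, a_idx])

def rl_gwo(fitness_fn, dim=10, n_wolves=30, iters=200, bounds=(-100.0, 100.0)):
    """Illustrative RL-assisted GWO loop; the chosen action rescales the a schedule."""
    lo, hi = bounds
    positions = np.random.uniform(lo, hi, size=(n_wolves, dim))
    controller = QController()
    best_val = np.inf
    for t in range(iters):
        fitness = np.apply_along_axis(fitness_fn, 1, positions)
        order = np.argsort(fitness)
        leaders = [positions[order[k]].copy() for k in range(3)]  # alpha, beta, delta
        s = controller.state(t / iters)
        a_idx = controller.act(s)
        a = max(0.0, 2.0 * (1.0 - t / iters)) * controller.actions[a_idx]
        positions = gwo_step(positions, leaders, a, bounds)
        new_best = fitness[order[0]]
        reward = 1.0 if new_best < best_val else -0.1    # reward improvement of the best wolf
        best_val = min(best_val, new_best)
        controller.update(s, a_idx, reward, controller.state((t + 1) / iters))
    return best_val

# Example usage on the sphere function: rl_gwo(lambda x: np.sum(x ** 2))
```

Even this simplified coupling illustrates the general idea the abstract describes: search feedback, rather than a fixed schedule, drives the exploration–exploitation balance.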
Source journal: Knowledge-Based Systems (Engineering & Technology – Computer Science: Artificial Intelligence)
CiteScore: 14.80
Self-citation rate: 12.50%
Articles published: 1245
Average review time: 7.8 months
Journal description: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.