A Reinforcement Learning-Based Bi-Population Nutcracker Optimizer for Global Optimization.

IF 3.9 | Zone 3 (Medicine) | Q1 | ENGINEERING, MULTIDISCIPLINARY

Biomimetics Pub Date : 2024-10-01 DOI:10.3390/biomimetics9100596

Yu Li, Yan Zhang



Abstract

The nutcracker optimizer algorithm (NOA) is a recently proposed metaheuristic that solves optimization problems by simulating how nutcrackers search for and store food in nature. However, the traditional NOA struggles to balance global exploration and local exploitation, which makes it prone to becoming trapped in local optima on complex problems. To address these shortcomings, this study proposes a reinforcement learning-based bi-population nutcracker optimizer algorithm called RLNOA. RLNOA introduces a bi-population mechanism to better balance global and local optimization capabilities. At the beginning of each iteration, the raw population is divided into an exploration sub-population and an exploitation sub-population according to the fitness value of each individual; the exploration sub-population consists of the individuals with poor fitness values. To enhance diversity, the exploration sub-population is updated with an improved foraging strategy based on random opposition-based learning. Meanwhile, Q-learning serves as an adaptive selector for exploitation strategies, so that the behavior of the exploitation sub-population can be adjusted to the problem at hand. The performance of RLNOA is evaluated on the CEC-2014, CEC-2017, and CEC-2020 benchmark function sets and compared against nine state-of-the-art metaheuristic algorithms. Experimental results demonstrate the superior performance of the proposed algorithm.
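The three mechanisms named in the abstract (fitness-based population split, random opposition-based learning, and Q-learning as a strategy selector) can be illustrated with a minimal Python sketch. This is not the authors' RLNOA: the abstract does not give the update equations, the split ratio, or the Q-learning states, actions, and rewards, so everything below is an assumption. In particular, the 50/50 split, the widely used random-opposition form `l + u - r * x`, the greedy acceptance rule, the two placeholder exploitation moves, the "state = last action" encoding, the ±1 reward, and all function names are hypothetical choices made only for illustration.

```python
import random


def split_population(fitness, explore_frac=0.5):
    """Return (exploit_idx, explore_idx) for a minimization problem.

    The worst `explore_frac` share of individuals goes to exploration.
    The 0.5 default is an assumption, not a value from the paper.
    """
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    cut = len(fitness) - int(len(fitness) * explore_frac)
    return order[:cut], order[cut:]


def random_opposition(x, lower, upper, rng):
    """Random opposition-based point l + u - r * x per dimension, clipped to bounds."""
    return [min(max(l + u - rng.random() * xi, l), u)
            for xi, l, u in zip(x, lower, upper)]


class QSelector:
    """Tabular Q-learning with epsilon-greedy choice over strategy indices."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 epsilon=0.1, rng=None):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = rng or random.Random()

    def choose(self, state):
        row = self.q[state]
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(row))
        return max(range(len(row)), key=row.__getitem__)

    def update(self, state, action, reward, next_state):
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def sphere(x):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


def toy_run(dim=5, pop_size=20, iters=200, seed=0):
    """Run the toy bi-population loop; return (initial_best, final_best)."""
    rng = random.Random(seed)
    lower, upper = [-10.0] * dim, [10.0] * dim
    pop = [[rng.uniform(l, u) for l, u in zip(lower, upper)]
           for _ in range(pop_size)]
    fit = [sphere(x) for x in pop]
    initial_best = min(fit)

    # Placeholder exploitation moves (assumptions, not the paper's strategies).
    def toward_best(x, best):
        return [xi + rng.random() * (bi - xi) for xi, bi in zip(x, best)]

    def jitter_best(x, best):
        return [bi + rng.gauss(0.0, 0.1) for bi in best]

    moves = [toward_best, jitter_best]
    selector = QSelector(len(moves), len(moves), rng=rng)
    state = 0  # assumed state encoding: index of the previously chosen move

    for _ in range(iters):
        best = pop[min(range(pop_size), key=fit.__getitem__)]
        exploit_idx, explore_idx = split_population(fit)
        for i in explore_idx:  # diversity step via random opposition
            cand = random_opposition(pop[i], lower, upper, rng)
            f = sphere(cand)
            if f < fit[i]:  # greedy acceptance (assumption)
                pop[i], fit[i] = cand, f
        for i in exploit_idx:  # strategy picked by the Q-learning selector
            action = selector.choose(state)
            raw = moves[action](pop[i], best)
            cand = [min(max(v, l), u) for v, l, u in zip(raw, lower, upper)]
            f = sphere(cand)
            improved = f < fit[i]
            if improved:
                pop[i], fit[i] = cand, f
            selector.update(state, action, 1.0 if improved else -1.0, action)
            state = action
    return initial_best, min(fit)
```

Calling `toy_run()` returns the best objective value before and after the loop. Because candidates are only accepted when they improve an individual, the final value can never be worse than the initial one; how much it improves depends on the placeholder moves and says nothing about the performance reported in the paper.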


