Metaheuristic-based weight optimization for robust deep reinforcement learning in continuous control
Authors: Gwang-Jong Ko, Jaeseok Huh
Journal: Swarm and Evolutionary Computation, vol. 95, Article 101920 (published 2025-04-01)
DOI: 10.1016/j.swevo.2025.101920
URL: https://www.sciencedirect.com/science/article/pii/S2210650225000781
Impact factor: 8.2; JCR: Q1 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Policy-based deep reinforcement learning (DRL) algorithms have recently exhibited superior performance on continuous control problems such as robotic arm control and robot gait learning. However, these algorithms frequently face challenges inherent in gradient descent-based weight optimization, including susceptibility to local optima, slow learning caused by saddle points, approximation errors, and suboptimal hyperparameters. The resulting instability leads to significant performance discrepancies among agent instances trained under identical settings, which complicates the practical application of reinforcement learning. To address this, we propose a metaheuristic-based weight optimization framework designed to mitigate learning instability in DRL for continuous control tasks. The framework introduces a two-phase optimization process in which an additional search phase using swarm intelligence algorithms is conducted at the end of the DRL learning phase. In numerical experiments on robot locomotion tasks, the proposed framework demonstrated superior and more stable performance compared to conventional DRL algorithms.
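The two-phase process described in the abstract can be sketched roughly as follows. The abstract does not name a specific swarm intelligence algorithm, so this sketch assumes particle swarm optimization (PSO). The names `pso_refine` and `toy_return` and all parameter values are hypothetical illustrations, not the authors' implementation. Phase 1 (DRL training) is represented only by its output, a flat weight vector, and the fitness function is a toy stand-in for an episode return.

```python
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def toy_return(weights: Sequence[float]) -> float:
    """Hypothetical stand-in for an episode return (higher is better).

    Peaks at 0.0 when every weight equals 1.0. A real fitness would roll out
    the policy in the control environment instead.
    """
    return -sum((w - 1.0) ** 2 for w in weights)


def pso_refine(
    trained_weights: Sequence[float],
    fitness: Callable[[Sequence[float]], float],
    n_particles: int = 20,
    n_iters: int = 50,
    init_noise: float = 0.1,
    inertia: float = 0.7,
    c1: float = 1.5,
    c2: float = 1.5,
    seed: int = 0,
) -> Tuple[Vector, float]:
    """Phase 2 (assumed to be PSO): search around weights produced by phase 1."""
    rng = random.Random(seed)
    dim = len(trained_weights)

    # Particle 0 is the trained weight vector itself; the remaining particles
    # are Gaussian perturbations of it.
    positions = [list(trained_weights)] + [
        [w + rng.gauss(0.0, init_noise) for w in trained_weights]
        for _ in range(n_particles - 1)
    ]
    velocities = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests and the global best, initialised from the starting swarm.
    pbest = [p[:] for p in positions]
    pbest_fit = [fitness(p) for p in positions]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Standard PSO update: inertia + cognitive pull + social pull.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c1 * r1 * (pbest[i][d] - positions[i][d])
                    + c2 * r2 * (gbest[d] - positions[i][d])
                )
                positions[i][d] += velocities[i][d]
            fit = fitness(positions[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = positions[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = positions[i][:], fit
    return gbest, gbest_fit


if __name__ == "__main__":
    # Placeholder for the output of phase 1 (DRL training), values made up.
    phase1_weights = [0.5, 1.4, 0.9]
    refined, score = pso_refine(phase1_weights, toy_return)
    print("before:", toy_return(phase1_weights), "after:", score)
```

Because the trained weights are kept as one particle and the global best is only replaced on improvement, the search cannot return a vector that scores worse than its starting point under the chosen fitness. In a real setting the fitness would likely need to average returns over several evaluation episodes, since a single rollout is noisy.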
About the journal:
Swarm and Evolutionary Computation is a pioneering peer-reviewed journal focused on the latest research and advancements in nature-inspired intelligent computation using swarm and evolutionary algorithms. It covers theoretical, experimental, and practical aspects of these paradigms and their hybrids, promoting interdisciplinary research. The journal prioritizes the publication of high-quality, original articles that push the boundaries of evolutionary computation and swarm intelligence. Additionally, it welcomes survey papers on current topics and novel applications. Topics of interest include but are not limited to: Genetic Algorithms and Genetic Programming, Evolution Strategies and Evolutionary Programming, Differential Evolution, Artificial Immune Systems, Particle Swarms, Ant Colony, Bacterial Foraging, Artificial Bees, Firefly Algorithm, Harmony Search, Artificial Life, Digital Organisms, Estimation of Distribution Algorithms, Stochastic Diffusion Search, Quantum Computing, Nano Computing, Membrane Computing, Human-centric Computing, Hybridization of Algorithms, Memetic Computing, Autonomic Computing, Self-organizing Systems, and Combinatorial, Discrete, Binary, Constrained, Multi-objective, Multi-modal, Dynamic, and Large-scale Optimization.