{"title":"基于算子选择和经验过滤的策略进化强化学习。","authors":"Kaitong Zheng,Ya-Hui Jia,Kejiang Ye,Wei-Neng Chen","doi":"10.1109/tnnls.2025.3596553","DOIUrl":null,"url":null,"abstract":"The shared replay buffer is the core of synergy in evolutionary reinforcement learning (ERL). Existing methods overlooked the objective conflict between population evolution in evolutionary algorithm and ERL, leading to poor quality of the replay buffer. In this article, we propose a strategic ERL algorithm with operator selection and experience filter (SERL-OS-EF) to address the objective conflict issue and improve the synergy from three aspects: 1) an operator selection strategy is proposed to enhance the performance of all individuals, thereby fundamentally improving the quality of experiences generated by the population; 2) an experience filter is introduced to filter the experiences obtained from the population, maintaining the long-term high quality of the buffer; and 3) a dynamic mixed sampling strategy is introduced to improve the efficiency of RL agent learning from the buffer. Experiments in four MuJoCo locomotion environments and three Ant-Maze environments with deceptive rewards demonstrate the superiority of the proposed method. In addition, the practical significance of the proposed method is verified on a low-carbon multienergy microgrid (MEMG) energy management task.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"24 1","pages":""},"PeriodicalIF":8.9000,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Strategic Evolutionary Reinforcement Learning With Operator Selection and Experience Filter.\",\"authors\":\"Kaitong Zheng,Ya-Hui Jia,Kejiang Ye,Wei-Neng Chen\",\"doi\":\"10.1109/tnnls.2025.3596553\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The shared replay buffer is the core of synergy in evolutionary reinforcement learning (ERL). Existing methods overlooked the objective conflict between population evolution in evolutionary algorithm and ERL, leading to poor quality of the replay buffer. In this article, we propose a strategic ERL algorithm with operator selection and experience filter (SERL-OS-EF) to address the objective conflict issue and improve the synergy from three aspects: 1) an operator selection strategy is proposed to enhance the performance of all individuals, thereby fundamentally improving the quality of experiences generated by the population; 2) an experience filter is introduced to filter the experiences obtained from the population, maintaining the long-term high quality of the buffer; and 3) a dynamic mixed sampling strategy is introduced to improve the efficiency of RL agent learning from the buffer. Experiments in four MuJoCo locomotion environments and three Ant-Maze environments with deceptive rewards demonstrate the superiority of the proposed method. 
In addition, the practical significance of the proposed method is verified on a low-carbon multienergy microgrid (MEMG) energy management task.\",\"PeriodicalId\":13303,\"journal\":{\"name\":\"IEEE transactions on neural networks and learning systems\",\"volume\":\"24 1\",\"pages\":\"\"},\"PeriodicalIF\":8.9000,\"publicationDate\":\"2025-08-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on neural networks and learning systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1109/tnnls.2025.3596553\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/tnnls.2025.3596553","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Abstract:
The shared replay buffer is the core of the synergy in evolutionary reinforcement learning (ERL). Existing methods overlook the objective conflict between the population evolution of the evolutionary algorithm and the RL agent, which leads to a low-quality replay buffer. In this article, we propose a strategic ERL algorithm with operator selection and experience filter (SERL-OS-EF) that addresses the objective conflict and improves the synergy in three respects: 1) an operator selection strategy enhances the performance of all individuals, fundamentally improving the quality of the experiences generated by the population; 2) an experience filter screens the experiences obtained from the population, maintaining the long-term high quality of the buffer; and 3) a dynamic mixed sampling strategy improves the efficiency with which the RL agent learns from the buffer. Experiments in four MuJoCo locomotion environments and three Ant-Maze environments with deceptive rewards demonstrate the superiority of the proposed method. In addition, the practical significance of the method is verified on a low-carbon multienergy microgrid (MEMG) energy management task.
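The abstract names the buffer-quality mechanisms but gives no implementation details. The Python sketch below is purely illustrative, showing one plausible shape for two of them: an experience filter that screens population-generated episodes by return, and a dynamic mixed sampling schedule over a population buffer and the agent's own buffer. All names, the filtering criterion (a smoothed return baseline), and the linear mixing schedule are assumptions made for illustration, not the paper's actual design.

```python
# Illustrative sketch only: hypothetical names and criteria, not the
# authors' implementation (the abstract does not specify these details).
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity FIFO buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)

    def add(self, transition):
        self.storage.append(transition)

    def sample(self, n):
        # Cap at the current size so sampling never raises on a young buffer.
        return random.sample(list(self.storage), min(n, len(self.storage)))

    def __len__(self):
        return len(self.storage)


class FilteredPopulationBuffer(ReplayBuffer):
    """Experience filter (assumed criterion): admit an individual's episode
    only if its return beats a smoothed baseline of past returns, so
    low-quality population experiences never enter the shared buffer."""

    def __init__(self, capacity, momentum=0.95):
        super().__init__(capacity)
        self.baseline = None
        self.momentum = momentum

    def add_episode(self, transitions, episode_return):
        if self.baseline is None or episode_return >= self.baseline:
            for t in transitions:
                self.add(t)
        # Update the running baseline whether or not the episode was admitted.
        if self.baseline is None:
            self.baseline = episode_return
        else:
            self.baseline = (self.momentum * self.baseline
                             + (1.0 - self.momentum) * episode_return)


def mixed_sample(pop_buffer, agent_buffer, batch_size, progress):
    """Dynamic mixed sampling (assumed schedule): draw more from the diverse
    population buffer early in training and shift toward the agent's own
    buffer as `progress` (fraction of training completed) approaches 1."""
    pop_ratio = max(0.1, 1.0 - progress)          # illustrative linear decay
    n_pop = int(batch_size * pop_ratio)
    batch = pop_buffer.sample(n_pop) + agent_buffer.sample(batch_size - n_pop)
    random.shuffle(batch)
    return batch


if __name__ == "__main__":
    pop = FilteredPopulationBuffer(capacity=10_000)
    own = ReplayBuffer(capacity=10_000)
    # Dummy transitions stand in for real environment rollouts.
    pop.add_episode([("s", "a", 1.0, "s2", False)] * 50, episode_return=50.0)
    for _ in range(50):
        own.add(("s", "a", 0.0, "s2", False))
    print(len(mixed_sample(pop, own, batch_size=32, progress=0.25)))
```

Run stand-alone, this prints the size of one mixed batch; in an actual ERL loop, the filtered buffer would receive each population individual's rollout and the RL agent would train on batches drawn via mixed_sample. Whether the real SERL-OS-EF filter operates at episode or transition granularity, and how its sampling ratio is scheduled, cannot be inferred from the abstract.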
Journal Introduction:
IEEE Transactions on Neural Networks and Learning Systems presents scholarly articles on the theory, design, and applications of neural networks and other learning systems, with an emphasis on technical and scientific research in this domain.