Distributed Ensembles of Reinforcement Learning Agents for Electricity Control
Authors: Pierrick Pochelu, B. Conche, S. Petiton
Published in: 2022 IEEE International Conference on Omni-layer Intelligent Systems (COINS), 2022-08-01
DOI: https://doi.org/10.1109/COINS54846.2022.9854987
Citation count: 2
Abstract
Deep Reinforcement Learning (or just "RL") is gaining popularity in industrial and research applications. However, it still suffers from key limitations that slow its widespread adoption: its performance is sensitive to initial conditions and non-determinism. To address these challenges, we propose a procedure for building ensembles of RL agents that efficiently make better local decisions toward long-term cumulative rewards. For the first time, hundreds of experiments have been run to compare different ensemble construction procedures on 2 electricity control environments. We found that an ensemble of 4 agents improves accumulated rewards by 36% on average, improves stability by a factor of 2.05, and can be naturally and efficiently trained and used for prediction in parallel on GPUs and CPUs.
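As an illustration of how an ensemble of agents might turn several local decisions into one, the sketch below averages per-agent action probabilities and picks the most probable action. The abstract does not state which combination rule the authors use, so the averaging rule, the function name `ensemble_action`, and the probability values are all assumptions made for this example, not the paper's method.

```python
import numpy as np

def ensemble_action(prob_vectors):
    """Average per-agent action probabilities and return the greedy action.

    Assumption: each agent outputs a probability vector over the same
    discrete action set. Averaging is one common combination rule; the
    abstract does not say which rule the authors actually use.
    """
    probs = np.asarray(prob_vectors, dtype=float)  # shape: (n_agents, n_actions)
    if probs.ndim != 2:
        raise ValueError("expected one probability vector per agent")
    mean_probs = probs.mean(axis=0)  # combine the agents' local decisions
    return int(np.argmax(mean_probs)), mean_probs

# Hypothetical outputs of 4 agents over 3 discrete actions (illustrative only).
agent_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
    [0.5, 0.4, 0.1],
]
action, mean_probs = ensemble_action(agent_probs)
print(action, mean_probs)
```

Because each agent's forward pass is independent of the others, the per-agent probability vectors could in principle be computed in parallel before this combination step, which is in line with the parallel prediction the abstract mentions.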