Distributed Ensembles of Reinforcement Learning Agents for Electricity Control
Pierrick Pochelu, B. Conche, S. Petiton
2022 IEEE International Conference on Omni-layer Intelligent Systems (COINS), 2022-08-01
DOI: https://doi.org/10.1109/COINS54846.2022.9854987
Deep Reinforcement Learning (or just "RL") is gaining popularity in industrial and research applications. However, it still suffers from some key limitations that slow its widespread adoption. In particular, its performance is sensitive to initial conditions and non-determinism. To address these challenges, we propose a procedure for building ensembles of RL agents that efficiently make better local decisions toward long-term cumulative rewards. For the first time, hundreds of experiments have been run to compare different ensemble construction procedures on two electricity control environments. We found that an ensemble of 4 agents improves accumulated rewards by 36% on average, improves stability by a factor of 2.05, and can be naturally and efficiently trained and run for inference in parallel on GPUs and CPUs.
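The abstract does not say how the agents' decisions are combined, so the following is only a minimal sketch of one common option, not the authors' procedure. It assumes a discrete action space and that each agent exposes a probability for every action; the ensemble averages those probabilities and picks the most likely action. The names `Policy`, `ensemble_action`, and `make_constant_policy` are hypothetical and introduced here for illustration.

```python
from typing import Callable, List, Sequence

# Hypothetical agent interface (an assumption, not from the paper):
# a policy maps an observation to one probability per discrete action.
Policy = Callable[[Sequence[float]], List[float]]


def ensemble_action(policies: Sequence[Policy], observation: Sequence[float]) -> int:
    """Average the agents' action probabilities and return the argmax action."""
    if not policies:
        raise ValueError("need at least one policy")
    # Each agent is queried independently, so these calls could run in parallel.
    per_agent = [policy(observation) for policy in policies]
    n_actions = len(per_agent[0])
    if any(len(probs) != n_actions for probs in per_agent):
        raise ValueError("all policies must score the same number of actions")
    mean_probs = [
        sum(probs[a] for probs in per_agent) / len(per_agent)
        for a in range(n_actions)
    ]
    return max(range(n_actions), key=mean_probs.__getitem__)


def make_constant_policy(probs: Sequence[float]) -> Policy:
    """Toy stand-in for a trained agent: ignores the observation."""
    return lambda observation: list(probs)


# Four toy agents that disagree on the best of three actions.
agents = [
    make_constant_policy(p)
    for p in ([0.6, 0.3, 0.1], [0.1, 0.5, 0.4], [0.2, 0.5, 0.3], [0.3, 0.4, 0.3])
]
chosen = ensemble_action(agents, observation=[0.0])
print(chosen)
```

In this toy case the first agent alone would pick action 0, while the averaged probabilities favor action 1. Other aggregation rules, such as majority voting, would fit the same interface.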