Jupiter Bakakeu, Dominik Kißkalt, J. Franke, S. Baer, H. Klos, J. Peschke
{"title":"面向信息物理生产系统能量优化的多智能体强化学习","authors":"Jupiter Bakakeu, Dominik Kißkalt, J. Franke, S. Baer, H. Klos, J. Peschke","doi":"10.1109/CCECE47787.2020.9255795","DOIUrl":null,"url":null,"abstract":"The paper proposes an artificial intelligence-based solution for the efficient operation of a heterogeneous cluster of flexible manufacturing machines with energy generation and storage capabilities in an electricity micro-grid featuring high volatility of electricity prices. The problem of finding the optimal control policy is first formulated as a game-theoretic sequential decision-making problem under uncertainty, where at every time step the uncertainty is characterized by future weather-dependent energy prices, high demand fluctuation, as well as random unexpected disturbances on the factory floor. Because of the parallel interaction of the machines with the grid, the local viewpoints of an agent are non-stationary and non-Markovian. Therefore, traditional methods such as standard reinforcement learning approaches that learn a specialized policy for a single machine are not applicable. To address this problem, we propose a multi-agent actor-critic method that takes into account the policies of other participants to achieve explicit coordination between a large numbers of actors. We show the strength of our approach in mixed cooperative and competitive scenarios where different production machines were able to discover different coordination strategies in order to increase the energy efficiency of the whole factory floor.","PeriodicalId":296506,"journal":{"name":"2020 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)","volume":"198 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Multi-Agent Reinforcement Learning for the Energy Optimization of Cyber-Physical Production Systems\",\"authors\":\"Jupiter Bakakeu, Dominik Kißkalt, J. Franke, S. Baer, H. Klos, J. Peschke\",\"doi\":\"10.1109/CCECE47787.2020.9255795\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The paper proposes an artificial intelligence-based solution for the efficient operation of a heterogeneous cluster of flexible manufacturing machines with energy generation and storage capabilities in an electricity micro-grid featuring high volatility of electricity prices. The problem of finding the optimal control policy is first formulated as a game-theoretic sequential decision-making problem under uncertainty, where at every time step the uncertainty is characterized by future weather-dependent energy prices, high demand fluctuation, as well as random unexpected disturbances on the factory floor. Because of the parallel interaction of the machines with the grid, the local viewpoints of an agent are non-stationary and non-Markovian. Therefore, traditional methods such as standard reinforcement learning approaches that learn a specialized policy for a single machine are not applicable. To address this problem, we propose a multi-agent actor-critic method that takes into account the policies of other participants to achieve explicit coordination between a large numbers of actors. 
We show the strength of our approach in mixed cooperative and competitive scenarios where different production machines were able to discover different coordination strategies in order to increase the energy efficiency of the whole factory floor.\",\"PeriodicalId\":296506,\"journal\":{\"name\":\"2020 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)\",\"volume\":\"198 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-08-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CCECE47787.2020.9255795\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CCECE47787.2020.9255795","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multi-Agent Reinforcement Learning for the Energy Optimization of Cyber-Physical Production Systems
The paper proposes an artificial intelligence-based solution for the efficient operation of a heterogeneous cluster of flexible manufacturing machines with energy generation and storage capabilities in an electricity micro-grid with highly volatile electricity prices. The problem of finding the optimal control policy is first formulated as a game-theoretic sequential decision-making problem under uncertainty, where at every time step the uncertainty stems from future weather-dependent energy prices, strong demand fluctuations, and random, unexpected disturbances on the factory floor. Because the machines interact with the grid in parallel, the local viewpoint of each agent is non-stationary and non-Markovian. Therefore, traditional methods such as standard reinforcement learning approaches that learn a specialized policy for a single machine are not applicable. To address this problem, we propose a multi-agent actor-critic method that takes the policies of the other participants into account to achieve explicit coordination between a large number of actors. We demonstrate the strength of our approach in mixed cooperative and competitive scenarios, where different production machines discover different coordination strategies in order to increase the energy efficiency of the whole factory floor.
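The abstract does not include implementation details, but the centralized-coordination idea it describes can be illustrated with a short sketch. Below is a minimal, hypothetical example of a multi-agent actor-critic update with decentralized actors and one centralized critic per agent (in the spirit of MADDPG-style methods). PyTorch, the network sizes, the observation/action dimensions, and the reward definition are all assumptions for illustration, not the authors' published code.

```python
# Minimal sketch: decentralized actors + centralized critics.
# All sizes, the batch layout, and the reward are illustrative assumptions.
import torch
import torch.nn as nn

N_AGENTS = 3   # e.g. number of flexible manufacturing machines (assumed)
OBS_DIM = 8    # local observation: machine state, price forecast, ... (assumed)
ACT_DIM = 4    # local action: production / charge / discharge set-points (assumed)

class Actor(nn.Module):
    """Decentralized policy: maps a local observation to a local action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACT_DIM), nn.Tanh(),
        )

    def forward(self, obs):
        return self.net(obs)

class CentralCritic(nn.Module):
    """Centralized critic: scores the joint observation and joint action,
    so each agent's value estimate conditions on the other agents' policies."""
    def __init__(self):
        super().__init__()
        joint_dim = N_AGENTS * (OBS_DIM + ACT_DIM)
        self.net = nn.Sequential(
            nn.Linear(joint_dim, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, joint_obs, joint_act):
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))

actors = [Actor() for _ in range(N_AGENTS)]
critics = [CentralCritic() for _ in range(N_AGENTS)]
actor_opts = [torch.optim.Adam(a.parameters(), lr=1e-3) for a in actors]
critic_opts = [torch.optim.Adam(c.parameters(), lr=1e-3) for c in critics]

def update(batch, gamma=0.99):
    """One gradient step per agent on a sampled batch of transitions.

    batch: dict of tensors
      obs, next_obs: (B, N_AGENTS, OBS_DIM)
      act:           (B, N_AGENTS, ACT_DIM)
      rew:           (B, N_AGENTS), e.g. negative energy cost per machine
    """
    B = batch["obs"].shape[0]
    joint_obs = batch["obs"].reshape(B, -1)
    joint_act = batch["act"].reshape(B, -1)
    with torch.no_grad():
        # Next joint action under the current policies (no target networks
        # in this sketch, for brevity).
        next_act = torch.stack(
            [actors[i](batch["next_obs"][:, i]) for i in range(N_AGENTS)], dim=1)
        joint_next_obs = batch["next_obs"].reshape(B, -1)
        joint_next_act = next_act.reshape(B, -1)

    for i in range(N_AGENTS):
        # Critic i: regress towards a one-step bootstrapped target that uses
        # everyone's next action.
        with torch.no_grad():
            target = batch["rew"][:, i:i + 1] + gamma * critics[i](joint_next_obs, joint_next_act)
        critic_loss = ((critics[i](joint_obs, joint_act) - target) ** 2).mean()
        critic_opts[i].zero_grad()
        critic_loss.backward()
        critic_opts[i].step()

        # Actor i: ascend its centralized critic, holding the other agents'
        # recorded actions fixed.
        act_i = actors[i](batch["obs"][:, i])
        mixed_act = batch["act"].clone()
        mixed_act[:, i] = act_i
        actor_loss = -critics[i](joint_obs, mixed_act.reshape(B, -1)).mean()
        actor_opts[i].zero_grad()
        actor_loss.backward()
        actor_opts[i].step()
```

The point the abstract makes is carried by CentralCritic: because each critic conditions on the joint observation and joint action, the other machines' behaviour is no longer hidden, unobserved dynamics, which gives every actor an approximately stationary learning target while the deployed policies remain decentralized.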