Magnus Tarle, Mårten Björkman, M. Larsson, L. Nordström, G. Ingeström
Title: A World Model Based Reinforcement Learning Architecture for Autonomous Power System Control
Published in: 2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)
Publication date: 2021-10-25
DOI: https://doi.org/10.1109/SmartGridComm51999.2021.9632332
Citations: 2
Abstract
Renewable generation is leading to rapidly shifting power flows, and it is anticipated that traditional power system control may soon be inadequate to cope with these fluctuations. Traditional control includes human-in-the-loop control schemes, while more autonomous control methods can be categorized as Wide-Area Monitoring, Protection and Control (WAMPAC) systems. Within this latter group of more advanced systems, reinforcement learning (RL) is a potential candidate for power system control in the face of these new challenges. In this paper we demonstrate how a model-based reinforcement learning (MBRL) algorithm, which learns and uses an internal model of the world, can be used for autonomous power system control. The proposed RL agent, called the World Model for Autonomous Power System Control (WMAP), includes a safety shield to minimize the risk of poor decisions under high uncertainty. The shield can be configured to permit WMAP to take actions on the condition that WMAP asks for guidance, e.g. from a human operator, when in doubt. Alternatively, WMAP can be run in full decision-support mode, which requires the operator to take all active decisions. A case study is performed on an IEEE 14-bus system where WMAP is set up to control the setpoints of two FACTS devices to emulate grid stability improvements. Results show that WMAP improves grid stability while staying within voltage limits. Furthermore, a disastrous situation is avoided when WMAP asks for help in a test scenario that it had not been trained for.
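The shield behaviour described in the abstract (act autonomously when confident, ask for guidance when in doubt, or defer entirely in decision-support mode) can be illustrated with a minimal Python sketch. The abstract does not say how WMAP quantifies uncertainty, so the disagreement measure below (the standard deviation across a hypothetical ensemble of world-model predictions), the threshold, and all names (`shield`, `ShieldDecision`, `decision_support_only`) are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from statistics import pstdev
from typing import Sequence


@dataclass(frozen=True)
class ShieldDecision:
    execute: bool       # True if the agent may apply the action itself
    action: float       # proposed setpoint (hypothetical scalar action)
    uncertainty: float  # disagreement measure used by the shield


def shield(
    proposed_action: float,
    ensemble_predictions: Sequence[float],
    threshold: float,
    decision_support_only: bool = False,
) -> ShieldDecision:
    """Gate an action on model uncertainty (illustrative assumption only).

    Uncertainty is taken as the population standard deviation of the
    predictions made by a hypothetical ensemble of world models. When it
    exceeds the threshold, or when decision-support mode is on, the action
    is not executed and is left for a human operator to decide.
    """
    uncertainty = pstdev(ensemble_predictions)
    execute = (not decision_support_only) and uncertainty <= threshold
    return ShieldDecision(execute, proposed_action, uncertainty)


# Ensemble members agree closely: the agent may act on its own.
confident = shield(1.02, [0.99, 1.00, 1.01], threshold=0.05)

# Ensemble members disagree (e.g. an unseen scenario): ask the operator.
in_doubt = shield(1.02, [0.80, 1.00, 1.20], threshold=0.05)

# Full decision-support mode: the operator always decides.
advisory = shield(1.02, [1.00, 1.00, 1.00], threshold=0.05,
                  decision_support_only=True)
```

In this sketch a decision with `execute` set to `False` corresponds to the agent asking for guidance: the proposed setpoint is still returned so that an operator could review it rather than start from nothing.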