Lucas Spangher, Akash Gokul, Manan Khattar, Joseph Palakapilly, A. Tawade, Adam Bouyamourn, Alex Devonport, C. Spanos
Title: Prospective Experiment for Reinforcement Learning on Demand Response in a Social Game Framework
DOI: 10.1145/3396851.3402365 (https://doi.org/10.1145/3396851.3402365)
Published: 2020-06-12, Proceedings of the Eleventh ACM International Conference on Future Energy Systems
Citations: 11
Abstract
Improving demand response can help optimize renewable energy use and might be possible using current tools in machine learning. We propose an experiment to test the development of Reinforcement Learning (RL) agents that learn to vary a daily grid price signal to optimize behavioral energy shifting in office workers. We describe our application of Batch-Constrained Q-learning (BCQ) and Soft Actor-Critic (SAC) as RL agents, and Social Cognitive Theory, LSTM networks, and linear regression as planning models. We report limited success in simulation with SAC and linear regression. Finally, we propose an experiment timeline for consideration.
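The core idea of the abstract — an agent sets a daily price signal and a simple planning model (here, linear in price) predicts how workers shift energy use — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the flat baseline demand, the solar-supply profile, and the price sensitivity are all assumptions made for the example.

```python
import numpy as np

def simulate_energy_response(prices, baseline, sensitivity=0.1):
    """Hypothetical linear response model: demand shifts away from
    hours with above-average prices toward below-average ones."""
    prices = np.asarray(prices, dtype=float)
    # Center prices so only relative price differences move demand.
    shift = -sensitivity * (prices - prices.mean())
    return np.clip(baseline + shift, 0.0, None)

def reward(energy, renewable_supply):
    """Reward the agent for matching demand to renewable supply
    (negative total hourly mismatch)."""
    return -np.abs(energy - renewable_supply).sum()

# Assumed setup: flat 24-hour baseline demand and a midday solar peak.
baseline = np.full(24, 1.0)                  # kWh per hour
renewables = np.concatenate([np.zeros(8), np.full(8, 1.5), np.zeros(8)])

# Two candidate daily price signals the agent might emit:
flat_prices = np.full(24, 1.0)
# Higher prices when renewables are scarce, lower when plentiful.
smart_prices = np.where(renewables >= 1.0, 0.5, 1.5)

r_flat = reward(simulate_energy_response(flat_prices, baseline), renewables)
r_smart = reward(simulate_energy_response(smart_prices, baseline), renewables)
assert r_smart > r_flat  # price-driven shifting reduces the mismatch
```

An RL agent such as SAC would, in this framing, learn the 24-dimensional price vector as its action, with the mismatch-based reward as its training signal; the linear response model stands in for the behavioral simulation the paper describes.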