A comparison of reinforcement learning based approaches to appliance scheduling
Authors: Namit Chauhan, Neha Choudhary, K. George
DOI: 10.1109/IC3I.2016.7917970
Venue: 2016 2nd International Conference on Contemporary Computing and Informatics (IC3I)
Published: December 2016
Abstract: Reinforcement learning is often proposed as a technique for intelligent control in a smart-home setup with dynamic real-time energy pricing and advanced sub-metering infrastructure. In this paper, we introduce a variation of State–Action–Reward–State–Action (SARSA) as an optimization algorithm for appliance scheduling in smart homes with multiple appliances and compare it with the popular reinforcement learning method Q-learning. A simple, intuitive, and unique tree-like Markov decision process (MDP) structure over appliances is proposed, which takes into account the states (such as on/off/runtime status) of all schedulable appliances but does not require knowledge of the state-to-state transition probabilities.
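For context on the comparison the abstract describes, the textbook update rules of the two methods can be sketched as below. This is not the paper's algorithm or its tree-like appliance MDP; it is a minimal, generic illustration of the one difference the abstract hinges on: SARSA is on-policy (it bootstraps on the next action actually taken), while Q-learning is off-policy (it bootstraps on the greedy next action). The state/action labels in the usage example are purely hypothetical.

```python
def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy SARSA: bootstrap on the action the policy actually takes next."""
    target = r + gamma * Q.get((s_next, a_next), 0.0)
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))

def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy Q-learning: bootstrap on the best available next action."""
    target = r + gamma * max(Q.get((s_next, b), 0.0) for b in actions)
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))

# Hypothetical appliance example (labels are illustrative, not from the paper):
Q = {}
sarsa_update(Q, ("washer", "off"), "turn_on", 1.0, ("washer", "on"), "keep_on")
```

Both methods update a tabular action-value function without a transition model, which matches the abstract's point that the proposed MDP structure needs no state-to-state transition probabilities.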