Human-in-the-loop control strategy for IoT-based smart thermostats with Deep Reinforcement Learning

Impact Factor: 9.6, JCR Q1 (Computer Science, Artificial Intelligence)

Energy and AI, Pub Date: 2025-03-15, DOI: 10.1016/j.egyai.2025.100490

Payam Fatehi Karjou, Fabian Stupperich, Phillip Stoffel, Dirk Müller

Energy and AI, Volume 20, Article 100490 (Journal Article). https://www.sciencedirect.com/science/article/pii/S2666546825000229

Citations: 0

Abstract

Thermostatic Radiator Valves (TRVs) are widely used to regulate room heating in European countries. Smart TRVs can provide significant energy savings, often 20–40% compared to conventional heating systems. They use sensors and algorithms to learn user behavior and optimize heating schedules accordingly. They can often be easily retrofitted to existing heating systems, making them a practical option for improving energy efficiency in existing buildings, especially office buildings, whose operational patterns are highly dynamic. This work presents a novel human-in-the-loop control strategy for Internet of Things (IoT)-based TRVs using Deep Reinforcement Learning (DRL). A key focus of this research is enhancing the adaptability of agents’ behavior by implementing a more generic and flexible Markov Decision Process (MDP) to promote policy generalization across diverse scenarios. The study explores the challenges of transferring control behaviors from simulation environments to real-world settings, examining performance across different thermal zones and evaluating how flexibly the control strategy integrates into building systems. Real-world occupant behavior, including dynamic comfort preferences and occupancy predictions, is incorporated to better align thermostat operation with user preferences. Furthermore, this paper discusses the practical challenges encountered during implementation, including battery consumption of IoT devices, integration of occupancy detection and prediction systems, and maintenance requirements. By addressing these issues, the proposed control strategy seeks to improve the scalability and feasibility of IoT-based TRVs, thereby providing a viable solution for their widespread deployment in buildings.


