Estimation of the air conditioning energy consumption of a classroom using machine learning in a tropical climate.

IF 2.4 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS
Frontiers in Big Data Pub Date : 2025-05-14 eCollection Date: 2025-01-01 DOI:10.3389/fdata.2025.1520574
Liliana Ortega-Diaz, Julian Jaramillo-Ibarra, German Osma-Pinto
Frontiers in Big Data, volume 8, article 1520574. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12116678/pdf/

Abstract

Air conditioning accounts for a considerable share of total energy consumption in buildings, which underlines the importance of measures that reduce it. Predicting energy consumption is critical to making informed decisions and identifying the factors that influence power consumption. Machine learning is the most widely used approach for prediction due to its speed, accuracy, and capacity for non-linear modeling. In this study, three machine learning models were used to predict the air conditioning energy demand of a classroom in an educational building in a hot tropical climate. The models selected are SVR (Support Vector Regressor), DT (Decision Tree), and RFR (Random Forest Regressor), chosen for their wide use in the literature; the goal is to establish which one offers the best performance for this case study through a comparative analysis based on performance metrics. Cross-validation was used to make training more robust. Twenty-two input variables were considered, grouped as climatological, operational, and temporal. Occupancy is the variable most strongly correlated with air conditioning consumption, with a positive correlation of 0.65. Monitoring was carried out for 72 days, including weekends. Six study scenarios were considered, in which the monitoring period varied, which in turn changed the number of samples. In addition, two sensitivity analyses were performed by modifying the time interval of the data (1, 5, 10, 20, 30, and 60 min) and the data split (50:50, 60:40, 70:30, 80:20, and 90:10). The models were evaluated using the RMSE, MAE, and R² metrics, chosen for their different characteristics and approaches to error measurement. During the training phase, the RFR model achieved a coefficient of determination (R²) of 0.97, while the SVR obtained an R² of 0.78 in the test phase. Finally, it is concluded that using shorter time intervals (every 1 min) in the data improves the performance of the predictive models. Splitting the data into 80:20 and 90:10 ratios resulted in the lowest RMSE values for the three models evaluated. Training the models with a larger amount of data allows them to capture more representative patterns, which improves their generalization ability and performance on new data.
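
The comparison described in the abstract can be illustrated with a minimal sketch in Python using scikit-learn. This is not the authors' code or data. The synthetic dataset, the three input features, the hyperparameters, the five-fold cross-validation, and the shuffled random split are all assumptions made for illustration; only the three model families, the three metrics, and the list of split ratios come from the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor


def make_synthetic_data(n_samples=600, seed=0):
    """Hypothetical stand-in for the monitored data (not the study's dataset)."""
    rng = np.random.default_rng(seed)
    occupancy = rng.integers(0, 40, n_samples)          # people in the room
    outdoor_temp = rng.uniform(22.0, 35.0, n_samples)   # degrees Celsius
    hour = rng.integers(6, 22, n_samples)               # hour of day
    X = np.column_stack([occupancy, outdoor_temp, hour])
    # Invented relationship: consumption rises with occupancy and temperature.
    noise = rng.normal(0.0, 0.3, n_samples)
    y = 0.05 * occupancy + 0.10 * (outdoor_temp - 22.0) + noise
    return X, y


def build_models(seed=0):
    """The three model families from the abstract; hyperparameters are assumed."""
    return {
        # SVR is sensitive to feature scale, so it is wrapped with a scaler.
        "SVR": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)),
        "DT": DecisionTreeRegressor(max_depth=6, random_state=seed),
        "RFR": RandomForestRegressor(n_estimators=50, random_state=seed),
    }


def evaluate(X, y, test_size=0.2, seed=0):
    """Cross-validate on the training part, then score on the held-out part."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    results = {}
    for name, model in build_models(seed).items():
        cv_r2 = cross_val_score(model, X_train, y_train, cv=5, scoring="r2").mean()
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results[name] = {
            "cv_r2": float(cv_r2),
            "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
            "mae": float(mean_absolute_error(y_test, pred)),
            "r2": float(r2_score(y_test, pred)),
        }
    return results


if __name__ == "__main__":
    X, y = make_synthetic_data()
    # Split ratios listed in the abstract, expressed as test fractions.
    for test_size in (0.5, 0.4, 0.3, 0.2, 0.1):
        for name, scores in evaluate(X, y, test_size=test_size).items():
            print(f"test={test_size:.1f} {name}: "
                  f"RMSE={scores['rmse']:.3f} MAE={scores['mae']:.3f} R2={scores['r2']:.3f}")
```

Because the data here are synthetic, the printed scores say nothing about the study's results. With real time-stamped measurements, a chronological split may be more appropriate than the shuffled split assumed here.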


Source journal metrics: CiteScore 5.20; self-citation rate 3.20%; articles published 122; review time 13 weeks.