Classification prediction model of indoor PM2.5 concentration using CatBoost algorithm

Impact factor 2.2 · JCR Q2 (Construction & Building Technology)
Zhenwei Guo, Xinyu Wang, Liang Ge
Frontiers in Built Environment · Journal Article · Published 2023-07-31
DOI: 10.3389/fbuil.2023.1207193 (https://doi.org/10.3389/fbuil.2023.1207193)
Citations: 0

Abstract

It is increasingly important to create a healthier indoor environment in office buildings. Accurate and reliable prediction of PM2.5 concentration can mitigate the response delay of indoor air quality control systems. The rapid development of machine learning provides a research basis for controlling PM2.5 concentration in such systems. One approach is to apply the CatBoost algorithm, which is based on ordered boosting, to the classification and prediction of indoor PM2.5 concentration. Using actual monitoring data from an office building, we take the previous indoor PM2.5 concentration, indoor temperature, relative humidity, CO2 concentration, and illuminance as input variables; the output indicates whether the indoor PM2.5 concentration exceeds 25 μg/m³. Based on the CatBoost algorithm, we construct an intelligent classification prediction model for indoor PM2.5 concentration. The model is evaluated on actual data and compared with multilayer perceptron (MLP), gradient boosting decision tree (GBDT), logistic regression (LR), decision tree (DT), and k-nearest neighbors (KNN) models. The CatBoost model performs strongly, achieving an area under the ROC curve (AUC) of 0.949 after hyperparameter optimization. Among the five input variables, feature importance ranks as follows: previous indoor PM2.5 concentration, relative humidity, CO2 concentration, indoor temperature, and illuminance. Verification shows that the CatBoost-based model can accurately predict the indoor PM2.5 concentration level. The model can be used to predict in advance whether the indoor PM2.5 concentration will exceed the standard and to guide regulation by the air quality control system.
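The abstract frames the task as binary classification against a 25 μg/m³ threshold and scores it by AUC. The following is a minimal sketch of those two pieces only, not the authors' pipeline: the function names `make_labels` and `roc_auc`, the one-step `horizon`, and the readings are all hypothetical and chosen for illustration. It uses no CatBoost model; the previous reading stands in as a naive "persistence" score so that the AUC calculation has something to rank.

```python
from typing import List, Sequence, Tuple

# Threshold stated in the abstract; everything else here is an assumption.
THRESHOLD_UG_M3 = 25.0


def make_labels(pm25: Sequence[float], horizon: int = 1) -> Tuple[List[float], List[int]]:
    """Pair each reading with a 0/1 label for the reading `horizon` steps later.

    Label 1 means the later reading exceeds THRESHOLD_UG_M3.
    """
    if horizon < 1 or horizon >= len(pm25):
        raise ValueError("horizon must be between 1 and len(pm25) - 1")
    previous = list(pm25[:-horizon])
    labels = [1 if value > THRESHOLD_UG_M3 else 0 for value in pm25[horizon:]]
    return previous, labels


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Made-up readings in ug/m3, purely illustrative (not data from the paper).
readings = [12.0, 18.0, 27.0, 31.0, 22.0, 26.0, 15.0, 30.0]
previous_pm25, exceeds = make_labels(readings, horizon=1)

# Persistence baseline: score each sample by the previous PM2.5 reading.
baseline_auc = roc_auc(exceeds, previous_pm25)
print(f"persistence-baseline AUC: {baseline_auc:.3f}")
```

In a fuller setup, the scores passed to `roc_auc` would be the predicted probabilities of a trained classifier fed all five input variables, and the same labels would serve as the training target.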