An unsupervised machine learning approach for indoor air pollution analysis

Impact factor: 3.5; JCR quartile: Q3 (Environmental Sciences)
Bárbara A. Macías-Hernández, Edgar Tello-Leal, Jailene Marlen Jaramillo-Perez and René Ventura-Houle
Environmental Science: Atmospheres, vol. 10, pp. 1144-1157. Published 2025-09-08. DOI: 10.1039/D5EA00051C
Article: https://pubs.rsc.org/en/content/articlelanding/2025/ea/d5ea00051c
PDF: https://pubs.rsc.org/en/content/articlepdf/2025/ea/d5ea00051c?page=search

Abstract

Exposure to indoor air pollutants is one of the most significant environmental and health risks people face, especially since they spend most of their time indoors. Evaluating indoor air pollution levels and comfort parameters is therefore essential for achieving sustainable indoor air quality (IAQ). The main objective of this study was to identify patterns of indoor air pollution in two buildings with different characteristics located on a university campus in northeastern Mexico. We measured the concentration of particulate matter in the 1.0 μm (PM1), 2.5 μm (PM2.5), and 10 μm (PM10) size fractions, as well as carbon dioxide (CO2), carbon monoxide (CO), and ozone (O3), along with the temperature and relative humidity in each microenvironment during working hours in spring, summer, and autumn. Unsupervised machine learning was then employed to identify behavioral patterns of air pollutants within the microenvironments. The K-means clustering algorithm was used to identify homogeneous microenvironments within the study area. We performed three clustering analyses per building: (1) considering all the variables in the dataset, (2) selecting the significant variables through principal component analysis (PCA), and (3) examining two time ranges within the working day. The robustness of the proposed approach was evaluated through a comparative analysis of the K-means, DBSCAN, and hierarchical clustering algorithms, assessing their performance using the Davies–Bouldin index and the silhouette score. Furthermore, the stability of the clusters across time intervals was assessed using the adjusted Rand index. Cluster analysis enabled us to identify microenvironments with maximum similarity and those that change groups because their behavior depends on the time range. Consequently, grouping microenvironments into homogeneous IAQ classes is effective for accurately identifying spaces based on patterns related to their contamination levels and for guiding actions to reduce pollution levels by zone or building.
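
The workflow described above (cluster the microenvironments, check cluster quality, then compare cluster membership between two time ranges) can be illustrated with a minimal sketch. This is not the authors' code: the room readings, the choice of K = 2, the farthest-point seeding, and all function names are assumptions made for illustration, and PCA, DBSCAN, hierarchical clustering, and the Davies–Bouldin index are omitted for brevity.

```python
import math
from collections import Counter

def standardize(rows):
    """Z-score each column so no variable dominates because of its units."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def kmeans(points, k, iters=100):
    """Lloyd's algorithm with deterministic farthest-point seeding (an assumption)."""
    centers = [list(points[0])]
    while len(centers) < k:
        centers.append(list(max(
            points, key=lambda p: min(math.dist(p, c) for c in centers))))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; values near 1 indicate compact, separated clusters."""
    scores = []
    for i, p in enumerate(points):
        dists = {}
        for j, q in enumerate(points):
            if i != j:
                dists.setdefault(labels[j], []).append(math.dist(p, q))
        own = dists.pop(labels[i], None)
        if not own or not dists:  # singleton cluster or only one cluster
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                            # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in dists.values())   # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def _pairs(counts):
    """Number of point pairs that fall inside the same group."""
    return sum(math.comb(c, 2) for c in counts)

def adjusted_rand_index(a, b):
    """Chance-corrected agreement between two partitions (1.0 = identical)."""
    both = _pairs(Counter(zip(a, b)).values())
    in_a, in_b = _pairs(Counter(a).values()), _pairs(Counter(b).values())
    expected = in_a * in_b / math.comb(len(a), 2)
    max_index = (in_a + in_b) / 2
    if max_index == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return (both - expected) / (max_index - expected)

# Hypothetical readings per room: [PM2.5 (ug/m3), CO2 (ppm), temperature (C)].
morning = [[8, 450, 24], [9, 470, 24.5], [7, 440, 23.5],
           [30, 900, 27], [32, 950, 27.5], [28, 880, 26.5]]
afternoon = [[8, 460, 24], [9, 480, 25], [31, 920, 27],
             [30, 910, 27], [33, 940, 28], [29, 890, 26.5]]

morning_labels = kmeans(standardize(morning), k=2)
afternoon_labels = kmeans(standardize(afternoon), k=2)
stability = adjusted_rand_index(morning_labels, afternoon_labels)

print("morning clusters:  ", morning_labels)
print("afternoon clusters:", afternoon_labels)
print("morning silhouette:", round(silhouette(standardize(morning), morning_labels), 3))
print("adjusted Rand index between time ranges:", round(stability, 3))
```

In this toy data the third room moves from the low-pollution group in the morning to the high-pollution group in the afternoon, which pulls the adjusted Rand index well below 1 and mirrors the "microenvironments that change groups" mentioned in the abstract. For real datasets, scikit-learn offers maintained equivalents (`KMeans`, `silhouette_score`, `davies_bouldin_score`, `adjusted_rand_score`); whether the authors used that library is not stated here.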
