Jongkeun Lee, Young Su Lee, Joo-Ae Kim, Seulki Jeong
Machine learning approaches to predict oxidative potential of fine particulate matter based on chemical constituents
Engineering Applications of Artificial Intelligence, Volume 156, Article 111170. Published 2025-05-18. Journal Article.
DOI: 10.1016/j.engappai.2025.111170
https://www.sciencedirect.com/science/article/pii/S0952197625011716
Impact factor: 7.5. JCR: Q1 (Automation & Control Systems).
Citations: 0
Abstract
Exposure to fine particulate matter (PM2.5) poses significant health risks, primarily due to its oxidative potential (OP), which induces oxidative stress and related diseases. This study aimed to predict the OP of PM2.5 based on its chemical constituents using machine learning (ML) models. We collected 119 PM2.5 samples from Seoul, Korea, between 2019 and 2021, and analyzed their chemical composition and OP using the dithiothreitol (DTT) assay. Three ML models were developed to predict OP: k-Nearest Neighbors (kNN), Random Forest (RF), and a Fully Connected Deep Neural Network (FCDNN). Among them, the RF model demonstrated the highest prediction accuracy, with coefficient of determination (R²) values ranging from 0.88 to 0.89 for training data and 0.36 to 0.62 for test data, followed by Extreme Gradient Boosting (XGBoost) and FCDNN with test R² values up to 0.53 and 0.39, respectively. Explainable Artificial Intelligence (AI) techniques, specifically feature importance and SHapley Additive exPlanations (SHAP), were employed to enhance the interpretability of the model, revealing the significant contributions of various chemical constituents. The study underscores the mixed effects of multiple factors on OP and highlights the potential of AI in providing robust predictive tools for environmental health. As OP measurement automation progresses, the availability of large datasets will further improve the accuracy and applicability of AI models, facilitating better health risk assessments and policy-making.
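The abstract reports the coefficient of determination separately for training and test data. As a rough illustration of how those two numbers are obtained, the sketch below computes R² = 1 - SS_res / SS_tot on a held-out split for a hand-rolled k-nearest-neighbors regressor. Everything in it is an assumption made for illustration: the data are synthetic, the three features, the 80/20 split, k = 3, and the helper names (`r2_score`, `knn_predict`, `make_synthetic`) are hypothetical, and none of it reproduces the authors' models, data, or settings.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(train_X, train_y, query, k=3):
    """Average the targets of the k training rows closest to `query`."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: math.dist(train_X[i], query),  # Euclidean distance
    )[:k]
    return sum(train_y[i] for i in nearest) / k


def make_synthetic(n, seed=0):
    """Hypothetical data: 3 made-up features and a noisy linear target."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(3)] for _ in range(n)]
    y = [2.0 * a + 0.5 * b - c + rng.gauss(0, 0.3) for a, b, c in X]
    return X, y


# 119 only mirrors the sample count in the abstract; the values are synthetic.
X, y = make_synthetic(119)
split = int(0.8 * len(X))  # assumed 80/20 split, not stated in the abstract
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

# Training predictions include each point as its own neighbor, which is one
# reason training R² tends to look better than test R².
train_pred = [knn_predict(X_train, y_train, x) for x in X_train]
test_pred = [knn_predict(X_train, y_train, x) for x in X_test]

print(f"train R2 = {r2_score(y_train, train_pred):.2f}")
print(f"test  R2 = {r2_score(y_test, test_pred):.2f}")
```

A large gap between the two printed values is the usual sign that a model fits its training samples more closely than it generalizes, which is the pattern the reported RF scores (0.88 to 0.89 versus 0.36 to 0.62) suggest for a dataset of this size.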
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, with remarkable advances across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research on the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI used in real-world engineering applications, validated on publicly available datasets to ensure that results can be replicated. Join us in exploring the transformative potential of AI in engineering.