Leveraging machine learning for accurate PM2.5 concentration prediction in selected Nigerian locations

IF 1.8 | Zone 4, Earth Sciences | JCR Q3, Geochemistry & Geophysics
O.A. Falaiye, O.F. Odubanjo, M. Sanni
Journal of Atmospheric and Solar-Terrestrial Physics, vol. 271, Article 106515. Published 2025-04-15. DOI: 10.1016/j.jastp.2025.106515
https://www.sciencedirect.com/science/article/pii/S1364682625000999
Citations: 0

Abstract

Air pollution, particularly from fine particulate matter (PM2.5), poses significant environmental and health threats. Accurate prediction of PM2.5 concentrations can help policymakers develop effective mitigation strategies. This research evaluates the performance of four popular machine learning models, namely Random Forest (RF), Gradient Boosting (GB), Support Vector Machine (SVM), and Multiple Linear Regression (MLR), in predicting PM2.5 concentrations at four Nigerian locations: Abuja, Anyigba, Benin City, and Osogbo. The study used hourly PM2.5 data from the Center for Atmospheric Research (CAR) Nigeria's Purple Air Real-Time Air Quality Sensors Network and meteorological data from the HelioClim solar radiation and meteorological data service. Model performance was assessed with Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R-squared (R²). Mean PM2.5 concentrations varied by location, with Benin City recording the highest mean (46.19 μg/m³) and Anyigba the lowest (14.36 μg/m³); concentrations were higher in the dry season at all locations. MAE values ranged from 2.25 μg/m³ (RF in Anyigba) to 12.43 μg/m³ (MLR in Benin City). The RF model consistently outperformed the others, achieving the highest R² values (up to 0.89 in Anyigba) and the lowest RMSE (3.55 μg/m³ in Anyigba). In contrast, the GB model showed moderate performance, with R² values around 0.68, while the SVM model had the lowest overall performance. Temperature had the highest average feature importance across the selected locations, making it the strongest predictor. These findings underscore the effectiveness of the RF model for PM2.5 prediction and suggest that future research should explore incorporating additional gaseous pollutants, such as O3, NO2, and SO2, to enhance predictive capability.
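The three evaluation metrics named in the abstract can be illustrated with a short sketch. It uses the standard textbook definitions of MAE, RMSE, and R²; the abstract does not say how the authors computed them, and the function names and the five observed/predicted values below are hypothetical, chosen only to make the arithmetic easy to follow. They are not data from the study.

```python
import math


def mae(observed, predicted):
    """Mean Absolute Error: average magnitude of the errors, in the data's units."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root Mean Squared Error: like MAE, but penalizes large errors more heavily."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 minus residual variance over total variance."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical hourly PM2.5 values in μg/m³ (illustrative only, not study data).
observed = [12.0, 15.0, 20.0, 18.0, 10.0]
predicted = [13.0, 14.0, 18.0, 19.0, 11.0]

print(f"MAE  = {mae(observed, predicted):.3f}")        # MAE  = 1.200
print(f"RMSE = {rmse(observed, predicted):.3f}")       # RMSE = 1.265
print(f"R2   = {r_squared(observed, predicted):.3f}")  # R2   = 0.882
```

MAE and RMSE carry the units of the target (μg/m³ here), so they are best compared between models at the same location, whereas R² is unitless. RMSE is never smaller than MAE for the same predictions, and a wide gap between the two points to a few large errors.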
Source journal: Journal of Atmospheric and Solar-Terrestrial Physics (Geosciences: Geochemistry & Geophysics)
CiteScore: 4.10
Self-citation rate: 5.30%
Publication volume: 95
Review time: 6 months
About the journal: The Journal of Atmospheric and Solar-Terrestrial Physics (JASTP) is an international journal concerned with the inter-disciplinary science of the Earth's atmospheric and space environment, especially the highly varied and highly variable physical phenomena that occur in this natural laboratory and the processes that couple them. The journal covers the physical processes operating in the troposphere, stratosphere, mesosphere, thermosphere, ionosphere, magnetosphere, the Sun, interplanetary medium, and heliosphere. Phenomena occurring in other "spheres", solar influences on climate, and supporting laboratory measurements are also considered. The journal deals especially with the coupling between the different regions. Solar flares, coronal mass ejections, and other energetic events on the Sun create interesting and important perturbations in the near-Earth space environment. The physics of such "space weather" is central to the Journal of Atmospheric and Solar-Terrestrial Physics, and the journal welcomes papers that lead in the direction of a predictive understanding of the coupled system. Regarding the upper atmosphere, the subjects of aeronomy, geomagnetism and geoelectricity, auroral phenomena, radio wave propagation, and plasma instabilities are examples within the broad field of solar-terrestrial physics which emphasise the energy exchange between the solar wind, the magnetospheric and ionospheric plasmas, and the neutral gas. In the lower atmosphere, topics covered range from mesoscale to global scale dynamics, to atmospheric electricity, lightning and its effects, and to anthropogenic changes.