Leveraging machine learning for accurate PM2.5 concentration prediction in selected Nigerian locations
Authors: O.A. Falaiye, O.F. Odubanjo, M. Sanni
Journal: Journal of Atmospheric and Solar-Terrestrial Physics, volume 271, Article 106515
Publication date: 2025-04-15
DOI: 10.1016/j.jastp.2025.106515
URL: https://www.sciencedirect.com/science/article/pii/S1364682625000999
Citation count: 0
Abstract
Air pollution, particularly from fine particulate matter (PM2.5), poses significant environmental and health threats. Accurate prediction of PM2.5 concentrations can help policymakers develop effective mitigation strategies. This research evaluates the performance of four widely used machine learning models, Random Forest (RF), Gradient Boosting (GB), Support Vector Machine (SVM), and Multiple Linear Regression (MLR), in predicting PM2.5 concentrations at four Nigerian locations: Abuja, Anyigba, Benin City, and Osogbo. The study used hourly PM2.5 data from the Center for Atmospheric Research (CAR) Nigeria's Purple Air Real-Time Air Quality Sensors Network and meteorological data from the HelioClim website, which provides solar radiation and meteorological data services. Model performance was assessed with Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the coefficient of determination (R²). Mean PM2.5 concentrations varied by location, with Benin City recording the highest level (46.19 μg/m³) and Anyigba the lowest (14.36 μg/m³); concentrations were higher in the dry season at all locations. MAE values ranged from 2.25 μg/m³ (RF in Anyigba) to 12.43 μg/m³ (MLR in Benin City). The RF model consistently outperformed the others, achieving the highest R² values (up to 0.89 in Anyigba) and the lowest RMSE (3.55 μg/m³ in Anyigba). The GB model showed moderate performance, with R² values around 0.68, while the SVM model had the lowest overall performance. Temperature had the highest average importance percentage across the selected locations, making it the most important predictor. These findings underscore the effectiveness of the RF model for PM2.5 prediction and suggest that future research should explore adding gaseous pollutants such as O₃, NO₂, and SO₂ as inputs to improve predictive capability.
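For readers unfamiliar with the three evaluation metrics named in the abstract, the following is a minimal sketch of their standard textbook definitions in plain Python. It is not the authors' code; the function names and the sample values are assumptions made up purely for illustration and do not come from the study.

```python
import math


def mae(observed, predicted):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root Mean Squared Error: like MAE, but penalizes large errors more."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical hourly PM2.5 values in μg/m³ (illustrative only, not study data).
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]

print(f"MAE  = {mae(observed, predicted):.2f} μg/m³")
print(f"RMSE = {rmse(observed, predicted):.2f} μg/m³")
print(f"R²   = {r_squared(observed, predicted):.2f}")
```

MAE and RMSE carry the units of the target (μg/m³ here), so lower is better, whereas R² is dimensionless and approaches 1 for a model that explains most of the variance in the observations.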
About the journal:
The Journal of Atmospheric and Solar-Terrestrial Physics (JASTP) is an international journal concerned with the inter-disciplinary science of the Earth's atmospheric and space environment, especially the highly varied and highly variable physical phenomena that occur in this natural laboratory and the processes that couple them.
The journal covers the physical processes operating in the troposphere, stratosphere, mesosphere, thermosphere, ionosphere, magnetosphere, the Sun, interplanetary medium, and heliosphere. Phenomena occurring in other "spheres", solar influences on climate, and supporting laboratory measurements are also considered. The journal deals especially with the coupling between the different regions.
Solar flares, coronal mass ejections, and other energetic events on the Sun create interesting and important perturbations in the near-Earth space environment. The physics of such "space weather" is central to the Journal of Atmospheric and Solar-Terrestrial Physics and the journal welcomes papers that lead in the direction of a predictive understanding of the coupled system. Regarding the upper atmosphere, the subjects of aeronomy, geomagnetism and geoelectricity, auroral phenomena, radio wave propagation, and plasma instabilities, are examples within the broad field of solar-terrestrial physics which emphasise the energy exchange between the solar wind, the magnetospheric and ionospheric plasmas, and the neutral gas. In the lower atmosphere, topics covered range from mesoscale to global scale dynamics, to atmospheric electricity, lightning and its effects, and to anthropogenic changes.