Hybrid machine learning to enhance PM2.5 forecasting performance by the WRF-Chem model
Authors: Laddawan Noynoo, Perapong Tekasakul, Thanathip Limna, Chidchanok Choksuchat, Korakot Wichitsa-Nguan Jetwanna, Chuen-Jinn Tsai, Thi-Cuc Le, Panwadee Suwattiga, John Morris, Racha Dejchanchaiwong
Journal: Atmospheric Pollution Research, vol. 16, no. 8, Article 102558 (impact factor 3.9; JCR Q2, Environmental Sciences)
Publication date: 2025-04-25
DOI: 10.1016/j.apr.2025.102558
URL: https://www.sciencedirect.com/science/article/pii/S1309104225001606
Citations: 0
Abstract
The Weather Research and Forecasting model coupled with Chemistry (WRF-Chem) has been widely used to forecast PM2.5 concentrations. However, uncertainties in global emission inventories, meteorological data, and simplified chemical parameterizations continue to pose challenges. We evaluated the performance of the original WRF-Chem model and of three models augmented with machine learning (ML) algorithms to enhance forecasting accuracy: Long Short-Term Memory (LSTM), Extreme Gradient Boosting (XGBoost), and a combined XGBoost-LSTM (Hybrid) approach. The ML models were trained and tested on a dataset of WRF-Chem-simulated meteorological and pollutant data at four monitoring stations in southern Thailand for 2019–2020. The WRF-Chem-Hybrid model significantly improved all metrics relative to the original WRF-Chem results: R² increased from insignificant values to 0.90 or more, RMSE decreased from 7.00–15.17 μg/m³ to 1.34–3.47 μg/m³, and MAE decreased from 5.52–10.55 μg/m³ to 0.79–1.49 μg/m³. A validation test covering all of 2021 also performed well, with R² = 0.94–0.96, RMSE = 1.56–2.48 μg/m³, and MAE = 1.01–1.56 μg/m³. The WRF-Chem-Hybrid model forecast PM2.5 concentrations up to 72 h in advance, with R² = 0.70–0.89, RMSE = 4.64–11.99 μg/m³, and MAE = 3.07–8.38 μg/m³. The hybrid model is therefore suggested for forecasting PM2.5 concentrations over southern Thailand and other regions up to 72 h in advance. Overall, this study demonstrates the advantages of augmenting the WRF-Chem model with ML to form hybrid models that more accurately forecast PM2.5 levels and their distribution and evolution over time, particularly in regions where PM2.5 levels are affected by open biomass burning from both local and cross-border emissions.
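The three metrics quoted above (R², RMSE, MAE) can be computed as in the minimal sketch below. The abstract does not say which definition of R² the authors used; this sketch assumes the coefficient-of-determination form, 1 - SS_res/SS_tot, and the function names and sample values are hypothetical and not taken from the paper.

```python
import math


def r2_score(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs (e.g. ug/m3)."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def mae(observed, predicted):
    """Mean absolute error, in the same units as the inputs."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Made-up PM2.5 values in ug/m3, for illustration only (not data from the study).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

print(f"R2   = {r2_score(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"MAE  = {mae(observed, predicted):.3f}")
```

RMSE penalizes large individual errors more heavily than MAE does, which is consistent with the RMSE ranges reported in the abstract always lying above the corresponding MAE ranges.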
About the journal:
Atmospheric Pollution Research (APR) is an international journal designed for the publication of articles on air pollution. Papers should present novel experimental results, theory and modeling of air pollution on local, regional, or global scales. Areas covered are research on inorganic, organic, and persistent organic air pollutants, air quality monitoring, air quality management, atmospheric dispersion and transport, air-surface (soil, water, and vegetation) exchange of pollutants, dry and wet deposition, indoor air quality, exposure assessment, health effects, satellite measurements, natural emissions, atmospheric chemistry, greenhouse gases, and effects on climate change.