An evolutionary deep learning model based on XGBoost feature selection and Gaussian data augmentation for AQI prediction
Journal: Process Safety and Environmental Protection (impact factor 6.9; JCR Q1, Engineering, Chemical)
DOI: 10.1016/j.psep.2024.08.119
Published: 2024-09-03
URL: https://www.sciencedirect.com/science/article/pii/S0957582024010929
Citations: 0
Abstract
Accurate air quality prediction is crucial for ensuring that air pollution control measures are scientifically sound and effective. This study proposes a combined deep learning (DL) model, XGBoost-GDA-TCN-IMRFO-GRU, for predicting hourly air quality index (AQI) data in four cities. The model integrates extreme gradient boosting (XGBoost) for feature selection, Gaussian data augmentation (GDA), an improved manta ray foraging optimization (IMRFO) algorithm, a temporal convolutional network (TCN), and a gated recurrent unit (GRU). XGBoost scores the pollutants that affect AQI and selects the four most important (PM2.5, PM10, NO2, O3) by importance ranking. GDA improves the robustness of the DL models and mitigates the problems of insufficient training data and overfitting. The IMRFO algorithm, which incorporates two improvement strategies, is applied to enhance the GRU model. TCN extracts spatiotemporal features of AQI, while GRU builds a computationally efficient temporal model. Compared with eleven benchmark models, the proposed model performs best in terms of mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), and Nash-Sutcliffe efficiency (NSE). Specifically, it reduces RMSE, MAE, and MAPE by 33–60%, 39–68%, and 39–66%, respectively, relative to the TCN model. The XGBoost-GDA-TCN-IMRFO-GRU model can therefore provide reliable early warnings for air quality, which is significant for air pollution prevention and the sustainable development of society.
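The abstract names Gaussian data augmentation and four evaluation metrics without giving their formulas. The sketch below illustrates one plausible reading using only the Python standard library. The noise scale (`sigma` times each column's standard deviation), the number of noisy copies, and every function name are assumptions made for illustration, not details taken from the paper. The metric functions follow the standard textbook definitions.

```python
import math
import random


def gaussian_augment(rows, sigma=0.01, copies=1, seed=0):
    """Return the original rows plus `copies` noisy replicas of each row.

    Assumption: noise is zero-mean Gaussian with a standard deviation of
    `sigma` times the per-column standard deviation of the original data.
    """
    rng = random.Random(seed)
    n_cols = len(rows[0])
    # Per-column population standard deviation of the original data.
    stds = []
    for j in range(n_cols):
        col = [r[j] for r in rows]
        mean = sum(col) / len(col)
        stds.append(math.sqrt(sum((v - mean) ** 2 for v in col) / len(col)))
    augmented = [list(r) for r in rows]  # keep the originals first
    for _ in range(copies):
        for r in rows:
            augmented.append(
                [v + rng.gauss(0.0, sigma * s) for v, s in zip(r, stds)]
            )
    return augmented


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def mape(y_true, y_pred):
    """Mean absolute percentage error; assumes no true value is zero."""
    return 100.0 * sum(
        abs((t - p) / t) for t, p in zip(y_true, y_pred)
    ) / len(y_true)


def nse(y_true, y_pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    mean_t = sum(y_true) / len(y_true)
    num = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    den = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - num / den
```

Lower MAE, RMSE, and MAPE indicate a better fit, while NSE improves as it approaches 1. Augmentation of this kind would normally be applied to the training split only, so that the metrics are computed on unperturbed observations.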
About the journal:
Process Safety and Environmental Protection (PSEP) is a leading international journal that publishes high-quality, original research papers in engineering, specifically on the safety of industrial processes and environmental protection. The journal encourages submissions that present new developments in safety and environmental aspects, particularly those that show how research findings can be applied in process engineering design and practice.
PSEP is particularly interested in research that brings fresh perspectives to established engineering principles, identifies unsolved problems, or suggests directions for future research. The journal also values contributions that push the boundaries of traditional engineering and welcomes multidisciplinary papers.
PSEP's articles are abstracted and indexed by a range of databases and services, which helps to ensure that the journal's research is accessible and recognized in the academic and professional communities. These databases include ANTE, Chemical Abstracts, Chemical Hazards in Industry, Current Contents, Elsevier Engineering Information database, Pascal Francis, Web of Science, Scopus, Engineering Information Database EnCompass LIT (Elsevier), and INSPEC. This wide coverage facilitates the dissemination of the journal's content to a global audience interested in process safety and environmental engineering.