Spatiotemporal stacking method with daily-cycle restrictions for reconstructing missing hourly PM2.5 records
Authors: Chuanfa Chen, Kunyu Li
Journal: Transactions in GIS (IF 2.1, JCR Q2, Geography)
Published: 2024-02-13
DOI: https://doi.org/10.1111/tgis.13141
Citations: 0
Abstract
Missing values compromise the reliability of hourly PM2.5 data from air quality monitoring stations and hinder thorough analysis of the records. This paper presents a spatiotemporal (ST) stacking machine learning (ML) method with daily-cycle restrictions for reconstructing missing hourly PM2.5 records. First, the ST neighbors of the target station with missing values are selected at a daily scale. Second, the non-null data within the ST neighbors are re-interpolated with an iterative P-BSHADE interpolation process. Third, a stacking ML model is trained with the re-interpolated values and several environmental factors associated with PM2.5 as predictors and the observed PM2.5 as the dependent variable. Finally, the missing values are reconstructed by feeding the predictors into the trained stacking model. The method was evaluated on hourly PM2.5 data from the Beijing-Tianjin-Hebei region with daily missing ratios of 10%, 30%, and 50%, and its accuracy was compared with that of four contemporary ST interpolation methods. The proposed method outperforms the classical methods, reducing the average root mean square error and mean absolute error by at least 40.6% and 40.1%, respectively. It also recovers extreme values in the hourly PM2.5 records, whereas the classical methods tend to overestimate low values and underestimate high values. Overall, the proposed method offers a viable and efficient way to recover missing values in hourly PM2.5 records that show evident daily periodic patterns.
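The evaluation protocol described in the abstract (hide a fixed share of each day's hourly values, reconstruct them, then score with RMSE and MAE) can be sketched as follows. This is only an illustration under stated assumptions: the synthetic sine-plus-noise series, the helper names `mask_daily` and `fill_linear`, and the linear-in-time filler are all hypothetical stand-ins. The filler is a simple baseline, not the P-BSHADE or stacking method of the paper, and the abstract does not say how the missing hours were chosen within a day, so random selection is assumed here.

```python
import math
import random

HOURS_PER_DAY = 24


def mask_daily(series, ratio, rng):
    """Hide `ratio` of the hourly values in each day (random hours, an assumption)."""
    masked = list(series)
    hidden = []
    for start in range(0, len(series), HOURS_PER_DAY):
        day = range(start, min(start + HOURS_PER_DAY, len(series)))
        k = round(ratio * len(day))
        for i in rng.sample(day, k):
            masked[i] = None
            hidden.append(i)
    return masked, sorted(hidden)


def fill_linear(masked):
    """Stand-in baseline: linear interpolation in time (NOT the paper's method)."""
    known = [i for i, v in enumerate(masked) if v is not None]
    filled = list(masked)
    for i, v in enumerate(masked):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:        # gap at the start: copy the next known value
            filled[i] = masked[right]
        elif right is None:     # gap at the end: copy the last known value
            filled[i] = masked[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = (1 - w) * masked[left] + w * masked[right]
    return filled


def rmse(truth, pred, idx):
    """Root mean square error over the hidden positions only."""
    return math.sqrt(sum((truth[i] - pred[i]) ** 2 for i in idx) / len(idx))


def mae(truth, pred, idx):
    """Mean absolute error over the hidden positions only."""
    return sum(abs(truth[i] - pred[i]) for i in idx) / len(idx)


# Synthetic week of hourly values with a daily cycle (illustrative data only).
rng = random.Random(0)
truth = [60 + 30 * math.sin(2 * math.pi * h / HOURS_PER_DAY) + rng.gauss(0, 5)
         for h in range(7 * HOURS_PER_DAY)]

for ratio in (0.1, 0.3, 0.5):
    masked, hidden = mask_daily(truth, ratio, rng)
    filled = fill_linear(masked)
    print(f"ratio={ratio:.0%}  RMSE={rmse(truth, filled, hidden):.2f}  "
          f"MAE={mae(truth, filled, hidden):.2f}")
```

Scoring only the hidden positions keeps the untouched observations from diluting the error, and any other reconstruction method can be swapped in for `fill_linear` to compare it under the same masks.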
About the journal:
Transactions in GIS is an international journal that provides a forum for high-quality, original research articles, review articles, short notes and book reviews that focus on:
- practical and theoretical issues influencing the development of GIS
- the collection, analysis, modelling, interpretation and display of spatial data within GIS
- the connections between GIS and related technologies
- new GIS applications which help to solve problems affecting the natural or built environments, or business