Reconstructing long-term (1980–2022) daily ground particulate matter concentrations in India (LongPMInd)
Shuai Wang, Mengyuan Zhang, Hui Zhao, Peng Wang, S. Kota, Qingyan Fu, Cong Liu, Hongliang Zhang
Earth System Science Data (impact factor 11.2, JCR Q1, GEOSCIENCES, MULTIDISCIPLINARY), published 2024-08-08
DOI: 10.5194/essd-16-3565-2024 (https://doi.org/10.5194/essd-16-3565-2024)
Citations: 0
Abstract
Severe airborne particulate matter (PM, including PM2.5 and PM10) pollution in India has caused widespread concern. Accurate PM concentrations are fundamental for scientific policymaking and health impact assessment, but surface observations in India are limited because monitoring sites are scarce and unevenly distributed. In this work, a simply structured, efficient, and robust model based on the Light Gradient-Boosting Machine (LightGBM) was developed to fuse multisource data and estimate long-term (1980–2022) historical daily ground PM concentrations in India (LongPMInd). The LightGBM model shows good accuracy, with out-of-sample, out-of-site, and out-of-year cross-validation (CV) test R² values of 0.77, 0.70, and 0.66, respectively. Small performance gaps between PM2.5 training and testing (delta RMSE of 1.06, 3.83, and 7.74 µg m−3) indicate low overfitting risk. Given this generalization ability, openly accessible, long-term, high-quality daily PM2.5 and PM10 products were then reconstructed (10 km, 1980–2022). The products show that India has experienced severe PM pollution in the Indo-Gangetic Plain (IGP), especially in winter. PM concentrations have increased significantly (p < 0.05) in most regions since 2000 (0.34 µg m−3 yr−1). The turning point occurred in 2018, when the Indian government launched the National Clean Air Programme, and PM2.5 concentrations declined in most regions (−0.78 µg m−3 yr−1) during 2018–2022. Severe PM2.5 pollution caused a continuous increase in attributable premature mortality, from 0.73 million (95 % confidence interval (CI) [0.65, 0.80]) in 2000 to 1.22 million (95 % CI [1.03, 1.41]) in 2019, particularly in the IGP, where attributable mortality increased from 0.36 million to 0.60 million. LongPMInd has the potential to support multiple applications in air quality management, public health initiatives, and efforts to address climate change.
The daily and monthly PM2.5 and PM10 concentrations are publicly accessible at https://doi.org/10.5281/zenodo.10073944 (Wang et al., 2023a).
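The three cross-validation schemes named in the abstract differ only in what is held out together. The sketch below is an illustration, not the authors' code: it assumes that out-of-sample CV holds out random individual records, out-of-site CV holds out whole monitoring sites, and out-of-year CV holds out whole years, as the names suggest. The helper `group_k_fold`, the `SCHEMES` table, the fold count of 5, and the synthetic site and year labels are all hypothetical. No model is fitted here; in practice each train/test split would be passed to a regressor such as LightGBM.

```python
import random
from collections import defaultdict


def group_k_fold(records, key, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k folds.

    All records that share the same key(index, record) value are kept in the
    same fold, so a held-out group never leaks into the training set.
    """
    groups = defaultdict(list)
    for i, rec in enumerate(records):
        groups[key(i, rec)].append(i)
    labels = sorted(groups)
    random.Random(seed).shuffle(labels)
    for fold in range(k):
        # Every k-th group label (starting at `fold`) forms the test fold.
        test = sorted(i for g in labels[fold::k] for i in groups[g])
        held_out = set(test)
        train = [i for i in range(len(records)) if i not in held_out]
        yield train, test


# Assumed meaning of the three schemes: the grouping key decides what is
# withheld as a unit (a single record, a whole site, or a whole year).
SCHEMES = {
    "out-of-sample": lambda i, rec: i,
    "out-of-site": lambda i, rec: rec["site"],
    "out-of-year": lambda i, rec: rec["year"],
}

# Synthetic placeholder records: 10 hypothetical sites, one record per year.
records = [
    {"site": f"S{s}", "year": y}
    for s in range(10)
    for y in range(2015, 2023)
]

for name, key in SCHEMES.items():
    sizes = [len(test) for _, test in group_k_fold(records, key)]
    print(name, "test-fold sizes:", sizes)
```

Grouping by site or year is a stricter test than random splitting because the model must predict for locations or periods it has never seen, which is consistent with the lower R² values reported for those two schemes.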
Earth System Science Data | GEOSCIENCES, MULTIDISCIPLINARY | METEOROLOGY & ATMOSPHERIC SCIENCES
CiteScore: 18.00
Self-citation rate: 5.30%
Articles published: 231
Review time: 35 weeks
About the journal:
Earth System Science Data (ESSD) is an international, interdisciplinary journal that publishes articles on original research data in order to promote the reuse of high-quality data in the field of Earth system sciences. The journal welcomes submissions of original data or data collections that meet the required quality standards and have the potential to contribute to the goals of the journal. It includes sections dedicated to regular-length articles, brief communications (such as updates to existing data sets), commentaries, review articles, and special issues. ESSD is abstracted and indexed in several databases, including Science Citation Index Expanded, Current Contents/PCE, Scopus, ADS, CLOCKSS, CNKI, DOAJ, EBSCO, Gale/Cengage, GoOA (CAS), and Google Scholar, among others.