Blending daily satellite precipitation product and rain gauges using stacking ensemble machine learning with the consideration of spatial heterogeneity
Authors: Chuanfa Chen, Jinda Hao, Shufan Yang, Yanyan Li
Journal of Hydrology, volume 658, Article 133223. Published 2025-03-31. Impact factor: 5.9 (JCR Q1, Engineering, Civil).
DOI: 10.1016/j.jhydrol.2025.133223
URL: https://www.sciencedirect.com/science/article/pii/S002216942500561X
Abstract
Blending satellite precipitation products (SPPs) with rain gauge observations using machine learning (ML) is an effective way to obtain high-accuracy precipitation data. However, conventional ML methods often neglect the spatial heterogeneity of precipitation across the study area, and the individual strengths of different ML models remain underutilized. To address these limitations, this paper proposes a stacking ensemble learning approach that accounts for spatial heterogeneity when blending SPPs with rain gauge data. Specifically, the study area is partitioned into several homogeneous zones to mitigate spatial heterogeneity, and every grid cell within a zone is assigned the same identifier (ID). A stacking ensemble ML framework that takes the ID as an input feature is then developed to merge SPPs and rain gauge observations. To evaluate the proposed method, we blended daily IMERG data and rain gauge observations from 2016 to 2020 across the Chinese mainland and benchmarked the result against seven ML methods and the original IMERG data. The experimental results provide several key insights: (i) Data-driven adaptive clustering proves to be an efficient tool for handling spatial heterogeneity in high-quality precipitation estimation. (ii) Across multiple temporal scales, the proposed method outperforms the classical ML-based methods. At the daily scale, it improves upon the classical approaches by at least 2.4 % in Mean Absolute Error (MAE), 0.76 % in Root Mean Square Error (RMSE), 1.4 % in Correlation Coefficient (CC), and 1.4 % in Kling-Gupta Efficiency (KGE). At the monthly and seasonal scales, it reduces MAE by at least 2.3 % and 2.8 %, respectively, and improves KGE by at least 0.9 % and 1.1 %, respectively. (iii) The spatial distribution of precipitation estimated by the proposed method agrees more closely with rain gauge observations than that of the classical methods. (iv) The ID feature plays a crucial role in precipitation estimation, ranking first in feature importance on 39.6 % of days and second on 33.9 % of days over the five-year period. (v) The proposed method yields positive incremental values at 69 % of rain gauge stations, demonstrating greater added value than the classical methods. Overall, the proposed method is an effective tool for generating high-accuracy daily precipitation products.
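The abstract reports gains in MAE, RMSE, CC, and KGE but does not give their formulas. As an illustration only, the sketch below computes the four scores for a gauge series and a blended estimate using the Python standard library. It assumes the original 2009 form of the Kling-Gupta Efficiency (correlation, variability ratio, and bias ratio) and population standard deviations; the paper may use a different variant. The function names and the sample values are hypothetical and are not taken from the authors' code or data.

```python
import math


def _mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def _std(values):
    """Population standard deviation (divides by n, an assumption here)."""
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def mae(obs, est):
    """Mean Absolute Error between observations and estimates."""
    return _mean([abs(e - o) for o, e in zip(obs, est)])


def rmse(obs, est):
    """Root Mean Square Error between observations and estimates."""
    return math.sqrt(_mean([(e - o) ** 2 for o, e in zip(obs, est)]))


def cc(obs, est):
    """Pearson correlation coefficient (undefined if either series is constant)."""
    mo, me = _mean(obs), _mean(est)
    cov = _mean([(o - mo) * (e - me) for o, e in zip(obs, est)])
    return cov / (_std(obs) * _std(est))


def kge(obs, est):
    """Kling-Gupta Efficiency, assumed 2009 formulation; 1.0 is a perfect score."""
    r = cc(obs, est)                   # linear correlation
    alpha = _std(est) / _std(obs)      # variability ratio
    beta = _mean(est) / _mean(obs)     # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical daily totals in mm, purely for demonstration.
gauge = [0.0, 2.0, 5.0, 1.0, 12.0]
blended = [0.5, 1.5, 6.0, 1.0, 10.0]

for name, score in [("MAE", mae), ("RMSE", rmse), ("CC", cc), ("KGE", kge)]:
    print(f"{name}: {score(gauge, blended):.4f}")
```

MAE and RMSE are errors, so lower is better, whereas CC and KGE are skill scores, so higher is better. This explains why the abstract speaks of reducing the former and enhancing the latter.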
About the journal:
The Journal of Hydrology publishes original research papers and comprehensive reviews in all the subfields of the hydrological sciences, including water-based management and policy issues that impact on economics and society. These comprise, but are not limited to, the physical, chemical, biogeochemical, stochastic and systems aspects of surface and groundwater hydrology, hydrometeorology and hydrogeology. Relevant topics incorporating the insights and methodologies of disciplines such as climatology, water resource systems, hydraulics, agrohydrology, geomorphology, soil science, instrumentation and remote sensing, civil and environmental engineering are included. Social science perspectives on hydrological problems such as resource and ecological economics, environmental sociology, psychology and behavioural science, management and policy analysis are also invited. Multi- and interdisciplinary analyses of hydrological problems are within scope. The science published in the Journal of Hydrology is relevant to catchment scales rather than exclusively to a local scale or site.