A novel diversity-aware sampling method for global soil moisture prediction
Qiyun Xiao, Qingliang Li, Lu Li, Cheng Zhang, Jinlong Zhu, Xiao Chen, Jing Wang, Wei Shangguan, Zhongwang Wei, Wenzong Dong, Yongjiu Dai
Journal of Hydrology, Volume 662, Article 133851 (Journal Article). Published 2025-07-07.
DOI: 10.1016/j.jhydrol.2025.133851
URL: https://www.sciencedirect.com/science/article/pii/S0022169425011898
Impact factor: 6.3. JCR: Q1 (Engineering, Civil).
Citations: 0
Abstract
Predicting global soil moisture (SM) is crucial for drought forecasting, agricultural management, and climate modeling. However, traditional deep learning (DL) methods often struggle with imbalanced sample distributions and limited spatial representation, which restrict their ability to accurately model SM patterns across diverse regions and time periods. Furthermore, variations in sample characteristics influenced by spatial proximity pose additional challenges in creating balanced and representative training datasets. To address these challenges, we propose a Diversity-Aware Sampling (DAS) strategy to enhance spatial representativeness and temporal diversity in training data. DAS enhances traditional sampling by grouping samples through clustering and categorizing each cluster into high, medium, and low uncertainty levels. This approach ensures each batch contains a balanced mix of samples across multiple grid points, improving coverage and representativeness. Applied to an LSTM-based model for 1- to 3-day global SM predictions, DAS achieved notable performance gains, increasing R² by up to 8.39% and KGE by up to 6.38%, demonstrating improved accuracy and stability. Ground-based evaluations using China Meteorological Administration (CMA) station data for 5-day drought forecasting further validated DAS’s superiority over traditional methods. By improving the spatial and temporal representativeness of training samples, DAS enhances the generalization of deep learning models in geoscience applications. This robust framework offers significant advancements in global SM prediction and drought monitoring.
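The sampling idea described in the abstract (cluster the samples, split by uncertainty level, then draw balanced batches) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm, how uncertainty is measured, or the batch composition. The sketch assumes k-means clustering, a precomputed per-sample uncertainty score, tercile cut-offs for the low/medium/high levels within each cluster, and an equal number of draws per (cluster, level) stratum. The function names kmeans, build_strata, and diverse_batches are hypothetical.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Tiny k-means on tuples of floats; returns one cluster label per point.

    Assumption: the source only says "clustering"; k-means is a stand-in.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; leave empty clusters in place.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


def build_strata(labels, uncertainty):
    """Split every cluster into low/medium/high uncertainty terciles (assumed rule)."""
    by_cluster = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_cluster[lab].append(idx)
    strata = {}
    for lab, idxs in by_cluster.items():
        idxs.sort(key=lambda i: uncertainty[i])
        n = len(idxs)
        cuts = [0, n // 3, 2 * n // 3, n]
        for level, (lo, hi) in zip(("low", "medium", "high"), zip(cuts, cuts[1:])):
            if hi > lo:  # skip levels that would be empty in very small clusters
                strata[(lab, level)] = idxs[lo:hi]
    return strata


def diverse_batches(strata, per_stratum, n_batches, seed=0):
    """Yield batches holding `per_stratum` sample indices from every stratum."""
    rng = random.Random(seed)
    for _ in range(n_batches):
        batch = []
        for key in sorted(strata):
            pool = strata[key]
            # Sample with replacement only when a stratum is too small.
            if len(pool) >= per_stratum:
                batch.extend(rng.sample(pool, per_stratum))
            else:
                batch.extend(rng.choices(pool, k=per_stratum))
        rng.shuffle(batch)
        yield batch


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic stand-ins: (lat, lon) features and a per-sample uncertainty score.
    points = [(rng.uniform(-60, 60), rng.uniform(-180, 180)) for _ in range(300)]
    uncertainty = [rng.random() for _ in points]
    labels = kmeans(points, k=4)
    strata = build_strata(labels, uncertainty)
    for batch in diverse_batches(strata, per_stratum=2, n_batches=3):
        print(len(batch), sorted({labels[i] for i in batch}))
```

Each yielded batch is a list of sample indices that could be passed to a data loader for the forecasting model. Drawing the same count from every stratum is one simple way to keep any single region or uncertainty level from dominating a batch; other weightings would fit the abstract's description equally well.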
About the journal:
The Journal of Hydrology publishes original research papers and comprehensive reviews in all the subfields of the hydrological sciences, including water-based management and policy issues that impact on economics and society. These comprise, but are not limited to, the physical, chemical, biogeochemical, stochastic and systems aspects of surface and groundwater hydrology, hydrometeorology and hydrogeology. Relevant topics incorporating the insights and methodologies of disciplines such as climatology, water resource systems, hydraulics, agrohydrology, geomorphology, soil science, instrumentation and remote sensing, civil and environmental engineering are included. Social science perspectives on hydrological problems such as resource and ecological economics, environmental sociology, psychology and behavioural science, management and policy analysis are also invited. Multi- and interdisciplinary analyses of hydrological problems are within scope. The science published in the Journal of Hydrology is relevant to catchment scales rather than exclusively to a local scale or site.