通过聚类平均采样策略改进全球土壤水分预测

IF 5.6 1区 农林科学 Q1 SOIL SCIENCE
{"title":"通过聚类平均采样策略改进全球土壤水分预测","authors":"","doi":"10.1016/j.geoderma.2024.116999","DOIUrl":null,"url":null,"abstract":"<div><p>Understanding and predicting global soil moisture (SM) is crucial for water resource management and agricultural production. While deep learning methods (DL) have shown strong performance in SM prediction, imbalances in training samples with different characteristics pose a significant challenge. We propose that improving the diversity and balance of batch training samples during gradient descent can help address this issue. To test this hypothesis, we developed a Cluster-Averaged Sampling (CAS) strategy utilizing unsupervised learning techniques. This approach involves training the model with evenly sampled data from different clusters, ensuring both sample diversity and numerical consistency within each cluster. This approach prevents the model from overemphasizing specific sample characteristics, leading to more balanced feature learning. Experiments using the LandBench1.0 dataset with five different seeds for 1-day lead-time global predictions reveal that CAS outperforms several Long Short-Term Memory (LSTM)-based models that do not employ this strategy. The median Coefficient of Determination (R<sup>2</sup>) improved by 2.36 % to 4.31 %, while Kling-Gupta Efficiency (KGE) improved by 1.95 % to 3.16 %. In high-latitude areas, R<sup>2</sup> improvements exceeded 40 % in specific regions. To further validate CAS under realistic conditions, we tested it using the Soil Moisture Active and Passive Level 3 (SMAP-L3) satellite data for 1 to 3-day lead-time global predictions, confirming its efficacy. The study substantiates the CAS strategy and introduces a novel training method for enhancing the generalization of DL models.</p></div>","PeriodicalId":12511,"journal":{"name":"Geoderma","volume":null,"pages":null},"PeriodicalIF":5.6000,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0016706124002283/pdfft?md5=7b7a5fc5b0181bfd9cd70f884cf867ba&pid=1-s2.0-S0016706124002283-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Improving global soil moisture prediction through cluster-averaged sampling strategy\",\"authors\":\"\",\"doi\":\"10.1016/j.geoderma.2024.116999\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Understanding and predicting global soil moisture (SM) is crucial for water resource management and agricultural production. While deep learning methods (DL) have shown strong performance in SM prediction, imbalances in training samples with different characteristics pose a significant challenge. We propose that improving the diversity and balance of batch training samples during gradient descent can help address this issue. To test this hypothesis, we developed a Cluster-Averaged Sampling (CAS) strategy utilizing unsupervised learning techniques. This approach involves training the model with evenly sampled data from different clusters, ensuring both sample diversity and numerical consistency within each cluster. This approach prevents the model from overemphasizing specific sample characteristics, leading to more balanced feature learning. Experiments using the LandBench1.0 dataset with five different seeds for 1-day lead-time global predictions reveal that CAS outperforms several Long Short-Term Memory (LSTM)-based models that do not employ this strategy. The median Coefficient of Determination (R<sup>2</sup>) improved by 2.36 % to 4.31 %, while Kling-Gupta Efficiency (KGE) improved by 1.95 % to 3.16 %. In high-latitude areas, R<sup>2</sup> improvements exceeded 40 % in specific regions. To further validate CAS under realistic conditions, we tested it using the Soil Moisture Active and Passive Level 3 (SMAP-L3) satellite data for 1 to 3-day lead-time global predictions, confirming its efficacy. The study substantiates the CAS strategy and introduces a novel training method for enhancing the generalization of DL models.</p></div>\",\"PeriodicalId\":12511,\"journal\":{\"name\":\"Geoderma\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":5.6000,\"publicationDate\":\"2024-08-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S0016706124002283/pdfft?md5=7b7a5fc5b0181bfd9cd70f884cf867ba&pid=1-s2.0-S0016706124002283-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Geoderma\",\"FirstCategoryId\":\"97\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0016706124002283\",\"RegionNum\":1,\"RegionCategory\":\"农林科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"SOIL SCIENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Geoderma","FirstCategoryId":"97","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0016706124002283","RegionNum":1,"RegionCategory":"农林科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"SOIL SCIENCE","Score":null,"Total":0}
引用次数: 0

摘要

了解和预测全球土壤湿度(SM)对于水资源管理和农业生产至关重要。虽然深度学习方法(DL)在土壤湿度预测方面表现出了很强的性能,但具有不同特征的训练样本的不平衡性带来了巨大的挑战。我们提出,在梯度下降过程中改善批量训练样本的多样性和平衡性有助于解决这一问题。为了验证这一假设,我们利用无监督学习技术开发了集群平均采样(CAS)策略。这种方法是用来自不同集群的均匀采样数据来训练模型,确保每个集群内的样本多样性和数值一致性。这种方法可以防止模型过分强调特定样本的特征,从而实现更均衡的特征学习。使用 LandBench1.0 数据集和五种不同的种子进行 1 天提前期全局预测的实验表明,CAS 的表现优于未采用这种策略的几种基于长短期记忆(LSTM)的模型。中位判定系数(R)提高了 2.36 %,达到 4.31 %;Kling-Gupta 效率(KGE)提高了 1.95 %,达到 3.16 %。在高纬度地区,特定区域的 R 提高了 40% 以上。为了在现实条件下进一步验证 CAS,我们使用土壤水分主动和被动三级(SMAP-L3)卫星数据对其进行了 1 至 3 天提前期全球预测测试,证实了其功效。这项研究证实了 CAS 策略,并引入了一种新的训练方法来增强 DL 模型的泛化能力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Improving global soil moisture prediction through cluster-averaged sampling strategy

Understanding and predicting global soil moisture (SM) is crucial for water resource management and agricultural production. While deep learning methods (DL) have shown strong performance in SM prediction, imbalances in training samples with different characteristics pose a significant challenge. We propose that improving the diversity and balance of batch training samples during gradient descent can help address this issue. To test this hypothesis, we developed a Cluster-Averaged Sampling (CAS) strategy utilizing unsupervised learning techniques. This approach involves training the model with evenly sampled data from different clusters, ensuring both sample diversity and numerical consistency within each cluster. This approach prevents the model from overemphasizing specific sample characteristics, leading to more balanced feature learning. Experiments using the LandBench1.0 dataset with five different seeds for 1-day lead-time global predictions reveal that CAS outperforms several Long Short-Term Memory (LSTM)-based models that do not employ this strategy. The median Coefficient of Determination (R2) improved by 2.36 % to 4.31 %, while Kling-Gupta Efficiency (KGE) improved by 1.95 % to 3.16 %. In high-latitude areas, R2 improvements exceeded 40 % in specific regions. To further validate CAS under realistic conditions, we tested it using the Soil Moisture Active and Passive Level 3 (SMAP-L3) satellite data for 1 to 3-day lead-time global predictions, confirming its efficacy. The study substantiates the CAS strategy and introduces a novel training method for enhancing the generalization of DL models.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Geoderma
Geoderma 农林科学-土壤科学
CiteScore
11.80
自引率
6.60%
发文量
597
审稿时长
58 days
期刊介绍: Geoderma - the global journal of soil science - welcomes authors, readers and soil research from all parts of the world, encourages worldwide soil studies, and embraces all aspects of soil science and its associated pedagogy. The journal particularly welcomes interdisciplinary work focusing on dynamic soil processes and functions across space and time.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信