A multimodal machine learning fused global 0.1° daily evapotranspiration dataset from 1950-2022

IF 5.7 | Zone 1, Agricultural and Forestry Sciences | JCR Q1, Agronomy
Qingchen Xu, Lu Li, Zhongwang Wei, Xingjie Lu, Nan Wei, Xuhui Lee, Yongjiu Dai
Agricultural and Forest Meteorology, volume 372, Article 110645. Published 2025-05-31. DOI: 10.1016/j.agrformet.2025.110645
https://www.sciencedirect.com/science/article/pii/S0168192325002655

Abstract

Evapotranspiration (ET) is the second-largest hydrological flux over the land surface and connects the water, energy, and carbon cycles. However, current ET products disagree substantially because of their coarse spatial resolution, short temporal coverage, and reliance on assumptions. This study introduces a multimodal machine learning framework that generates a high-resolution (0.1°, daily), long-term (1950–2022) global ET dataset by fusing 13 state-of-the-art ET products spanning remote sensing, machine learning, land surface models, and reanalysis data. The fusion relies on extensive flux tower observations (462 sites). The framework first reconstructs each ET product to a consistent spatiotemporal resolution and time range using Light Gradient Boosting Machine (LightGBM) models. An Automated Machine Learning (AutoML) technique then fuses ET, using the 13 reconstructed ET products, ERA5-Land atmospheric forcings, and ancillary data as predictors. In situ observations are used for model training and validation. Results demonstrate significant improvements over existing datasets: the fused product achieves the highest accuracy (Kling-Gupta efficiency, KGE = 0.857; root-mean-square error, RMSE = 0.726 mm/day) against in situ measurements across ecosystems and regions. The fused ET dataset realistically captures spatiotemporal variability and corrects the systematic underestimation bias prevalent in other datasets, particularly in wet regions. This dataset, with its high spatial and temporal resolution, enables more robust assessments of the water, energy, and carbon cycles in regional hydrology and ecology applications. The data integration methodology also provides a valuable framework for fusing multiple geoscience datasets with disparate properties.
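The two accuracy metrics quoted in the abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes the original 2009 formulation of the Kling-Gupta efficiency (correlation, variability ratio, and bias ratio weighted equally), which the abstract does not confirm, and the function names `rmse` and `kge` and the sample values are hypothetical.

```python
import math


def rmse(obs, sim):
    """Root-mean-square error between observed and simulated series."""
    n = len(obs)
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / n)


def kge(obs, sim):
    """Kling-Gupta efficiency, assuming the 2009 formulation:
    KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2).

    obs must not be constant and must have a non-zero mean,
    and sim must not be constant, or the ratios are undefined.
    """
    n = len(obs)
    mu_o = sum(obs) / n
    mu_s = sum(sim) / n
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    cov = sum((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)) / n

    r = cov / (sd_o * sd_s)   # linear (Pearson) correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


if __name__ == "__main__":
    # Made-up daily ET values in mm/day, for illustration only.
    tower = [1.0, 2.0, 3.0, 4.0]
    product = [2.0, 3.0, 4.0, 5.0]  # same pattern, constant +1 mm/day bias
    print(rmse(tower, product))     # 1.0
    print(kge(tower, product))      # approximately 0.6
```

In this toy case the product tracks the tower series perfectly in timing and variability, so the whole KGE penalty comes from the bias ratio. A KGE of 1 indicates perfect agreement, and lower values indicate worse agreement.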
Source journal
CiteScore: 10.30
Self-citation rate: 9.70%
Articles published: 415
Review time: 69 days
About the journal: Agricultural and Forest Meteorology is an international journal for the publication of original articles and reviews on the inter-relationship between meteorology, agriculture, forestry, and natural ecosystems. Emphasis is on basic and applied scientific research relevant to practical problems in the field of plant and soil sciences, ecology and biogeochemistry as affected by weather as well as climate variability and change. Theoretical models should be tested against experimental data. Articles must appeal to an international audience. Special issues devoted to single topics are also published. Typical topics include canopy micrometeorology (e.g., canopy radiation transfer, turbulence near the ground, evapotranspiration, energy balance, fluxes of trace gases), micrometeorological instrumentation (e.g., sensors for trace gases, flux measurement instruments, radiation measurement techniques), aerobiology (e.g., the dispersion of pollen, spores, insects and pesticides), biometeorology (e.g., the effect of weather and climate on plant distribution, crop yield, water-use efficiency, and plant phenology), forest-fire/weather interactions, and feedbacks from vegetation to weather and the climate system.