Using automated machine learning for the upscaling of gross primary productivity

Impact factor 3.9 · Zone 2 (Earth Sciences) · JCR Q1 (Ecology)
Max Gaber, Yanghui Kang, G. Schurgers, Trevor Keenan
Journal: Biogeosciences
Published: 2024-05-24
DOI: 10.5194/bg-21-2447-2024 (https://doi.org/10.5194/bg-21-2447-2024)

Abstract

Estimating gross primary productivity (GPP) over space and time is fundamental for understanding the response of the terrestrial biosphere to climate change. Eddy covariance flux towers provide in situ estimates of GPP at the ecosystem scale, but their sparse geographical distribution limits larger-scale inference. Machine learning (ML) techniques have been used to address this problem by extrapolating local GPP measurements over space using satellite remote sensing data. However, the accuracy of the regression model can be affected by uncertainties introduced by model selection, parameterization, and the choice of explanatory features, among others. Recent advances in automated ML (AutoML) provide a novel automated way to select and synthesize different ML models. In this work, we explore the potential of AutoML by training three major AutoML frameworks on eddy covariance measurements of GPP at 243 globally distributed sites. We compared their ability to predict GPP and its spatial and temporal variability based on different sets of remote sensing explanatory variables. Explanatory variables from Moderate Resolution Imaging Spectroradiometer (MODIS) surface reflectance data and photosynthetically active radiation alone explained over 70 % of the monthly variability in GPP, while satellite-derived proxies for canopy structure, photosynthetic activity, and environmental stressors, together with meteorological variables from reanalysis (ERA5-Land), further improved the frameworks' predictive ability. We found that the AutoML framework Auto-sklearn consistently outperformed the other AutoML frameworks as well as a classical random forest regressor in predicting GPP, reaching an r² of up to 0.75, although the performance differences were small. We deployed the best-performing framework to generate global wall-to-wall maps highlighting GPP patterns in good agreement with satellite-derived reference data. This research benchmarks the application of AutoML in GPP estimation and assesses its potential and limitations in quantifying global photosynthetic activity.
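The abstract reports model skill as r² (up to 0.75) without defining it. As a minimal sketch, assuming r² here denotes the coefficient of determination (it could equally be the squared Pearson correlation), the score can be computed as follows. The function name `r_squared` and the GPP values are hypothetical illustrations, not data or code from the paper.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: this is the meaning of r² in the abstract; the paper
    itself may define it differently.
    """
    obs_mean = mean(observed)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variability of the observations around their mean.
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical monthly GPP values at a single site (illustrative only).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.0, 2.5, 4.5, 4.5]

score = r_squared(observed, predicted)
print(f"r2 = {score:.2f}")
```

Under this definition, a score of 0.75 means the model leaves a quarter of the observed GPP variance unexplained, which matches the abstract's phrasing of variables that "explained over 70 %" of the monthly variability.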
Source journal: Biogeosciences (Environmental Sciences; Geosciences, Multidisciplinary)
CiteScore: 8.60
Self-citation rate: 8.20%
Articles published: 258
Review time: 4.2 months
About the journal: Biogeosciences (BG) is an international scientific journal dedicated to the publication and discussion of research articles, short communications and review papers on all aspects of the interactions between the biological, chemical and physical processes in terrestrial or extraterrestrial life with the geosphere, hydrosphere and atmosphere. The objective of the journal is to cut across the boundaries of established sciences and achieve an interdisciplinary view of these interactions. Experimental, conceptual and modelling approaches are welcome.