Variability and uncertainty in net ecosystem carbon exchange modeling: Systematic estimates at global flux sites via ensemble machine learning
Nannan Wang, Zijian Yue, Yaolin Liu, Zhaomin Tong, Yanfang Liu, Yanchi Lu, Yongge Shi
Agricultural and Forest Meteorology, Volume 374, Article 110784. Published 2025-08-21.
DOI: 10.1016/j.agrformet.2025.110784
URL: https://www.sciencedirect.com/science/article/pii/S0168192325004034
Impact factor: 5.7; JCR: Q1 (Agronomy)
Citations: 0
Abstract
Predicting net ecosystem carbon exchange (NEE) is crucial for understanding carbon dynamics. Machine learning (ML) has become pivotal for site-level modeling and spatial upscaling of NEE, yet spatiotemporal variability and uncertainty challenge its reliability and universality. A systematic quantification of the sources of variability and uncertainty in NEE modeling is still lacking, owing to the scale-dependent nature of carbon flux variations. This study therefore established a systematic framework to evaluate how model construction choices and environmental predictors affect ML-based NEE modeling across timescales, using multifaceted evaluation criteria. Using observations from FLUXNET 2015, AmeriFlux, and ICOS, alongside multi-source data, the study built a separate model for each combination of four timescales (daily, weekly, monthly, and yearly), four tree-based ensemble algorithms, and three data-splitting rules. The multifaceted assessment covered overall, across-site, seasonal, and anomaly perspectives. Key findings include: (1) Model construction. Boosting (LightGBM, XGBoost, and CatBoost) excelled at capturing temporal variability and anomalies, whereas bagging (Random Forest) was effective for spatial variability. Completely random data splitting increased the risk of overfitting and should be avoided. (2) Predictors. Environmental controls on accuracy varied with timescales, data situations, and ambient conditions. Predictors for NEE modeling should be selected based on their causal importance (e.g., evapotranspiration, vapor pressure deficit, and air temperature) and statistical relationships (e.g., leaf area index, elevation, and precipitation) with NEE, tailored to specific ambient conditions. Excessive predictors may degrade NEE prediction accuracy, particularly at large scales or in regions with high environment like arid areas. (3) Evaluation criteria. Rigorous multi-metric accuracy assessments proved essential, as reliance on a single metric or on overall accuracy alone could yield contradictory results. For instance, daily models achieved higher anomaly NSE (0.33 vs. 0.25) but lower overall NSE (0.54 vs. 0.59) than monthly models. NEE predictions had more difficulty accounting for spatial than temporal variability, resulting in lower accuracy for inter-annual than intra-annual predictions. This study advances ML-driven carbon flux modeling with actionable insights.
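The contrast between overall and anomaly NSE can be illustrated with a minimal sketch. The Nash-Sutcliffe efficiency itself is the standard formula, NSE = 1 - SSE / SST. Everything else here is an assumption made for illustration and is not taken from the paper: the function names (`nse`, `remove_group_means`, `anomaly_nse`) are hypothetical, anomalies are taken to be deviations from each group's own mean (for example a site mean), and the numbers are toy values rather than results from the study.

```python
from statistics import mean


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / SST, with SST taken around the mean of obs."""
    obs_mean = mean(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - sse / sst


def remove_group_means(values, groups):
    """Subtract from each value the mean of its own group (assumed anomaly definition)."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return [v - group_means[g] for v, g in zip(values, groups)]


def anomaly_nse(obs, pred, groups):
    """NSE computed after de-meaning obs and pred separately within each group."""
    return nse(remove_group_means(obs, groups), remove_group_means(pred, groups))


# Toy data (not from the paper): two sites with very different mean NEE.
sites = ["A", "A", "A", "B", "B", "B"]
obs = [-5.0, -3.0, -4.0, 1.0, 3.0, 2.0]
# A model that reproduces each site's mean but none of the within-site variation.
pred = [-4.0, -4.0, -4.0, 2.0, 2.0, 2.0]

overall = nse(obs, pred)                  # about 0.93
anomaly = anomaly_nse(obs, pred, sites)   # 0.0
print(f"overall NSE = {overall:.2f}, anomaly NSE = {anomaly:.2f}")
```

In this toy case the between-site difference dominates the total variance, so a model that only gets the site means right scores an overall NSE of about 0.93 while explaining none of the within-site anomalies. This is one way a single overall metric can mask weak anomaly skill.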
About the journal:
Agricultural and Forest Meteorology is an international journal for the publication of original articles and reviews on the inter-relationship between meteorology, agriculture, forestry, and natural ecosystems. Emphasis is on basic and applied scientific research relevant to practical problems in the field of plant and soil sciences, ecology and biogeochemistry as affected by weather as well as climate variability and change. Theoretical models should be tested against experimental data. Articles must appeal to an international audience. Special issues devoted to single topics are also published.
Typical topics include canopy micrometeorology (e.g., canopy radiation transfer, turbulence near the ground, evapotranspiration, energy balance, fluxes of trace gases), micrometeorological instrumentation (e.g., sensors for trace gases, flux measurement instruments, radiation measurement techniques), aerobiology (e.g., the dispersion of pollen, spores, insects and pesticides), biometeorology (e.g., the effect of weather and climate on plant distribution, crop yield, water-use efficiency, and plant phenology), forest-fire/weather interactions, and feedbacks from vegetation to weather and the climate system.