Multi-model ensemble successfully predicted atmospheric methane consumption in soils across the complex landscape

Environmental Dynamics and Global Climate Change. Pub Date: 2024-01-18. DOI: 10.18822/edgcc625761

M. Glagolev, D. V. Il’yasov, A. Sabrekov, I. Terentieva, D. V. Karelin



Abstract


Multi-model ensemble successfully predicted atmospheric methane consumption in soils across the complex landscape

Methane consumption by soils is a crucial component of the CH4 and carbon cycles. CH4 uptake by soils needs thorough investigation, particularly because it is expected to increase by the end of the century [Zhuang et al., 2013]. Numerous mathematical models, both empirical and detailed biogeochemical [Glagolev et al., 2023], have been developed to quantify the consumption of atmospheric methane by soils. These models help to handle spatio-temporal variability and can provide reliable estimates of regional and global methane consumption by soils. They also improve our understanding of the physical and biological processes that control the intensity of methanotrophy. Consequently, we can forecast the response of soil CH4 consumption to global climate change [Murguia-Flores et al., 2018], especially since many models account for the effect of changing atmospheric CH4 concentration on methanotrophy as well as for ecosystem type [Zhuang et al., 2013].

In addition to the use of individual models, such as those described in [Hagedorn et al., 2005; Glagolev et al., 2014; Ito et al., 2016; Silva et al., 2016], there has been extensive progress in using several models together as an ensemble. This approach aims to integrate as much a priori information as feasible [Lapko, 2002]. Throughout the 20th century, the concept of ensemble modeling evolved from merely drawing conclusions from the opinions of multiple independent experts (F. Sanders, 1963) to structured ensemble mathematical modeling [Hagedorn et al., 2005]. In this context, the term "ensemble" always refers to a collection containing more than one model.

The complexity of describing the physiology and biochemistry of methanotrophic bacteria in natural environments [Bedard, Knowles, 1989; Hanson, Hanson, 1996; Belova et al., 2013; Oshkin et al., 2014] makes it difficult to develop accurate biological models and to determine their specific biokinetic parameters [Curry, 2007]. At the same time, broader and often empirical models, such as those by [Potter et al., 1996; Ridgwell et al., 1999; Curry, 2007; Murguia-Flores et al., 2018], give reasonable estimates of global methane consumption. Model ensembles could improve accuracy not only in global and large-scale modeling but also at the level of local study sites. Nonetheless, ensemble modeling does not always guarantee optimal outcomes, because all models within an ensemble might overlook a biological process or effect that significantly influences the dynamics of a real ecosystem [Ito et al., 2016]. For instance, no model considered anaerobic methane oxidation until this process was identified empirically [Xu et al., 2015]. It is therefore crucial to validate the realism of an ensemble against specific in situ data for every application. This study aimed to develop an ensemble model describing methane consumption by soils and to test its efficacy on a randomly selected study site.

We closely examined and replicated the algorithms of four soil methane consumption models: the modification by Glagolev, Filippov [2011] of the model of Dörr et al. [1993], Curry's model [2007], the CH4 consumption block of the DLEM model [Tian et al., 2010], and the MeMo model excluding autochthonous CH4 sources [Murguia-Flores et al., 2018]. From these we built an ensemble of four models. Field measurements from the Kursk region in Russia served as the experimental in situ data. In addition, we introduced a method of averaging the ensemble prediction by assigning a weight coefficient to each model. The weighting rests on the idea that the total amount of available information doubles every few years, so newer models received higher weights and older ones lower weights.

The model ensemble effectively predicted CH4 consumption when compared with in situ measurements, albeit with a notably broad confidence interval for the predictions. There was minimal difference between the standard (unweighted) averaging of model predictions and weighted averaging. As anticipated, individual models underperformed compared to the ensemble. We computed Theil's inequality coefficient for various types of means, such as the quadratic, cubic, and biquadratic means, among others [Gini, Barbensi, 1958], both for the ensemble modeling results and for the individual models. The ensemble predictions, averaged by the different methods, yielded Theil coefficients ranging from 0.156 to 0.267. The most favorable outcome (0.156) was obtained with the power mean with a power index of 0.7. However, the power mean has a drawback: its power index is not known in advance but is chosen to best fit the experimental data. A similar limitation applies to the exponential mean. While the experimental data allow the selection of a parameter that yields a Theil coefficient of 0.157, this optimal value (1.3) cannot be determined in advance. Among the estimators that do not require selecting an optimal parameter, one of the best results (Theil coefficient = 0.166) unexpectedly came from the half-sum of extreme terms. Surprisingly, the median gave a less satisfactory result, with a Theil coefficient of 0.222.

The merit of the ensemble approach stems from P.D. Thompson's 1977 observation, which he stated assertively: "It is an indisputable fact that two or more inaccurate, but independent predictions of the same event can be combined in such a way that their 'combined' forecast, on average, will be more accurate than any of these individual forecasts" [Hagedorn et al., 2005]. Examining our ensemble of models through this lens reveals a limitation, because the condition of independence is not fully satisfied. The models by Dörr et al. [1993], Curry [2007], and MeMo [Murguia-Flores et al., 2018] share underlying similarities and can be regarded as a single cohesive cluster. Only DLEM, built on entirely distinct principles, stands apart from these models. To make the ensemble more robust in future iterations, we recommend including genuinely independent models, such as a modified version of MDM [Zhuang et al., 2013] and the model by Ridgwell et al. [1999].

The ensemble, comprising four models and applied without site-specific parameter adjustment, effectively captured methane consumption across diverse sites in the Kursk region, such as fields and forests. On average, the relative simulation error for all these sites was 36%, while the experimental data displayed a variation of 26%. Although the variation is modest for this dataset, methane uptake measurements generally tend to fluctuate by several tens of percent [Crill, 1991, Fig. 1; Ambus, Robertson, 2006, Fig. 3; Kleptsova et al., 2010; Glagolev et al., 2012]. Against this background, the simulation error achieved is favorable.

When we evaluated different methods of combining the individual model results within the ensemble (specifically, those methods that can be applied without prior parameter adjustment based on experimental data), the simplest operators yielded the best outcomes according to Theil's inequality coefficient. Both the half-sum of extreme terms and the arithmetic mean stood out in their performance. However, a significant drawback of the constructed ensemble is the wide confidence interval of its predictions, averaging ±78% at a 90% probability level. We hypothesize that increasing the number of independent models within the ensemble could narrow this interval.
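
The combination operators and the scoring criterion named in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation. The abstract does not state which form of Theil's coefficient was used, so the sketch assumes the common U1 form (root-mean-square error divided by the sum of the root mean squares of the predicted and observed series, which lies between 0 and 1). The function names and all numbers below are hypothetical and serve only to show the mechanics.

```python
import math
from statistics import mean, median


def half_sum_of_extremes(values):
    """Half-sum of extreme terms (midrange) of one site's model predictions."""
    return 0.5 * (min(values) + max(values))


def power_mean(values, p):
    """Power mean with index p (p != 0) of positive values."""
    if p == 0:
        raise ValueError("power index p must be non-zero")
    return mean([v ** p for v in values]) ** (1.0 / p)


def theil_coefficient(predicted, observed):
    """Theil's coefficient, assumed here to be the U1 form (0 = perfect fit)."""
    if len(predicted) != len(observed):
        raise ValueError("series must have the same length")
    rmse = math.sqrt(mean([(p - o) ** 2 for p, o in zip(predicted, observed)]))
    scale = (math.sqrt(mean([p * p for p in predicted]))
             + math.sqrt(mean([o * o for o in observed])))
    return rmse / scale


def combine(per_site_predictions, operator):
    """Apply one combination operator to the model predictions of every site."""
    return [operator(site) for site in per_site_predictions]


# Hypothetical data: three sites, four model predictions per site,
# and one observed uptake value per site (arbitrary units).
predictions = [
    [1.0, 3.0, 2.0, 2.0],
    [4.0, 2.0, 3.0, 3.0],
    [2.0, 2.0, 6.0, 2.0],
]
observed = [2.5, 3.0, 2.5]

operators = {
    "arithmetic mean": mean,
    "median": median,
    "half-sum of extremes": half_sum_of_extremes,
    "power mean (p=0.7)": lambda site: power_mean(site, 0.7),
}

for name, operator in operators.items():
    combined = combine(predictions, operator)
    print(f"{name}: U = {theil_coefficient(combined, observed):.3f}")
```

In this sketch the power index is passed in explicitly. That mirrors the caveat in the abstract: an index such as 0.7 can only be chosen after comparison with measurements, whereas the arithmetic mean, the median, and the half-sum of extremes need no fitted parameter.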

