Nonparametric quantification of uncertainty in multistep upscaling approaches: A case study on estimating forest biomass in the Brazilian Amazon

IF 5.7 Q1 ENVIRONMENTAL SCIENCES

Science of Remote Sensing Pub Date : 2024-12-07 DOI:10.1016/j.srs.2024.100180

Denis Valle, Leo Haneda, Rafael Izbicki, Renan Akio Kamimura, Bruna Pereira de Azevedo, Silvio H.M. Gomes, Arthur Sanchez, Carlos A. Silva, Danilo R.A. Almeida

Science of Remote Sensing, volume 11, Article 100180. https://www.sciencedirect.com/science/article/pii/S2666017224000646

Citations: 0

Abstract

Multistep upscaling approaches, in which field data are combined with data from multiple remote sensors that operate at different spatial scales (e.g., UAV LiDAR, GEDI, and Landsat), are becoming increasingly popular. In these approaches, a series of models is fitted to link the information from the different sensors, often resulting in improved predictions over large areas. Quantifying the uncertainty associated with individual models can be challenging because these models may not generate uncertainty estimates (e.g., machine learning models such as random forest), a problem that is exacerbated when the results from multiple models are combined within a multistep upscaling methodology. In this article, we describe a nonparametric conformal approach to quantify uncertainty. This approach is straightforward to apply, is computationally inexpensive (unlike bootstrapping), and generates improved predictive intervals. Importantly, this methodology can be used regardless of the number of models adopted in the upscaling approach and the nature of the intermediate models, as long as the final model can generate predictive intervals. We illustrate the improved empirical coverage of the conformalized predictive intervals using simulated data for a two-step upscaling scenario involving field, UAV LiDAR, and Landsat data. This simulation exercise shows that increasing uncertainty in the first-stage model (which relates biomass field data to UAV LiDAR data) causes naïve predictive intervals to underestimate uncertainty more severely. Conformalized predictive intervals, in contrast, do not exhibit this shortcoming. Finally, we illustrate uncertainty quantification for a multistep upscaling methodology using data from a large-scale carbon project in the Brazilian Amazon. Our validation exercise using these empirical data confirms the improved performance of the conformalized predictive intervals. We expect that the conformal approach described here will be key for uncertainty quantification as multistep upscaling approaches become increasingly common.
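The idea of conformalizing intervals from a final model can be illustrated with a minimal sketch. The abstract does not spell out the authors' exact procedure, so the code below is an assumption: it uses a split-conformal adjustment in the style of conformalized quantile regression, where a held-out calibration set with observed responses determines how much the naïve intervals must be widened. The function names, the synthetic data, and the choice of a constant naïve interval are all hypothetical and chosen only for illustration.

```python
import math
import numpy as np


def conformalize_intervals(lo_cal, hi_cal, y_cal, lo_new, hi_new, alpha=0.1):
    """Widen (or shrink) naive intervals using a calibration set.

    lo_cal, hi_cal: naive interval bounds from the final model on calibration data.
    y_cal: observed responses for the same calibration data.
    lo_new, hi_new: naive interval bounds for new locations.
    Returns the adjusted bounds and the adjustment q.
    """
    lo_cal, hi_cal, y_cal = (np.asarray(a, dtype=float) for a in (lo_cal, hi_cal, y_cal))
    n = y_cal.size
    # Conformity score: how far each observation falls outside its naive interval
    # (negative when the observation lies inside the interval).
    scores = np.maximum(lo_cal - y_cal, y_cal - hi_cal)
    # Finite-sample corrected rank for the (1 - alpha) quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        raise ValueError("calibration set too small for the requested alpha")
    q = np.sort(scores)[k - 1]
    return np.asarray(lo_new, dtype=float) - q, np.asarray(hi_new, dtype=float) + q, q


def demo(seed=0, n_cal=1000, n_test=1000, alpha=0.1):
    """Hypothetical example: naive 90% intervals that assume too little noise."""
    rng = np.random.default_rng(seed)
    # The true response has standard deviation 2, but the naive interval assumes 1,
    # mimicking upstream uncertainty that the final model does not see.
    y_cal = rng.normal(0.0, 2.0, size=n_cal)
    y_test = rng.normal(0.0, 2.0, size=n_test)
    lo_cal, hi_cal = np.full(n_cal, -1.645), np.full(n_cal, 1.645)
    lo_test, hi_test = np.full(n_test, -1.645), np.full(n_test, 1.645)

    naive_cov = np.mean((y_test >= lo_test) & (y_test <= hi_test))
    lo_c, hi_c, q = conformalize_intervals(lo_cal, hi_cal, y_cal, lo_test, hi_test, alpha)
    conformal_cov = np.mean((y_test >= lo_c) & (y_test <= hi_c))
    return naive_cov, conformal_cov, q


if __name__ == "__main__":
    naive_cov, conformal_cov, q = demo()
    print(f"naive coverage:     {naive_cov:.3f}")
    print(f"conformal coverage: {conformal_cov:.3f} (adjustment q = {q:.3f})")
```

In this sketch the adjustment depends only on the final model's intervals and the calibration responses, which is consistent with the abstract's point that the number and nature of intermediate models do not matter. The coverage guarantee of split-conformal methods relies on the calibration and new data being exchangeable, an assumption that would need to be checked for spatial data.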

Source journal

Science of Remote Sensing

CiteScore

12.20

Self-citation rate

0.00%

Number of publications