Nonparametric quantification of uncertainty in multistep upscaling approaches: A case study on estimating forest biomass in the Brazilian Amazon
Denis Valle, Leo Haneda, Rafael Izbicki, Renan Akio Kamimura, Bruna Pereira de Azevedo, Silvio H.M. Gomes, Arthur Sanchez, Carlos A. Silva, Danilo R.A. Almeida
Science of Remote Sensing, volume 11, Article 100180. Published 2024-12-07.
DOI: 10.1016/j.srs.2024.100180
URL: https://www.sciencedirect.com/science/article/pii/S2666017224000646
Abstract
Multistep upscaling approaches, in which field data are combined with data from multiple remote sensors that operate at different spatial scales (e.g., UAV LiDAR, GEDI, and Landsat), are becoming increasingly popular. In these approaches, a series of models is fitted to link the information from these different sensors, often resulting in improved predictions over large areas. Quantifying the uncertainty associated with individual models can be challenging because these models may not generate uncertainty estimates (e.g., machine learning models such as random forest), a problem that is further exacerbated if the results from multiple models are combined within a multistep upscaling methodology. In this article, we describe a nonparametric conformal approach to quantify uncertainty. This approach is straightforward to apply, is computationally inexpensive (unlike bootstrapping), and generates predictive intervals with improved empirical coverage. Importantly, this methodology can be used regardless of the number of models adopted in the upscaling approach and the nature of the intermediate models, as long as the final model can generate predictive intervals. We illustrate the improved empirical coverage of the conformalized predictive intervals using simulated data for a two-step upscaling scenario involving field, UAV LiDAR, and Landsat data. This simulation exercise shows that increasing uncertainty in the first-stage model (which relates biomass field data to UAV LiDAR data) causes naïve predictive intervals to underestimate uncertainty more severely. Conformalized predictive intervals, in contrast, do not exhibit this shortcoming. Finally, we illustrate uncertainty quantification for a multistep upscaling methodology using data from a large-scale carbon project in the Brazilian Amazon. Our validation exercise using these empirical data confirms the improved performance of the conformalized predictive intervals. We expect that the conformal approach described here will be key for uncertainty quantification as multistep upscaling approaches become increasingly common.
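The idea of conformalizing the final model's predictive intervals can be sketched as follows. This is a minimal illustration that assumes a split-conformal adjustment in the style of conformalized quantile regression; the abstract does not specify the exact procedure, so it should not be read as the authors' implementation. The function names (`conformal_adjustment`, `conformalize`), the simulated biomass values, and the noise levels are all hypothetical. The naïve intervals here are deliberately too narrow, mimicking a final model that ignores first-stage uncertainty.

```python
import math

import numpy as np


def conformal_adjustment(y_cal, lo_cal, hi_cal, alpha=0.1):
    """Amount by which naive intervals should be widened (assumed CQR-style rule)."""
    y_cal, lo_cal, hi_cal = (np.asarray(a, dtype=float) for a in (y_cal, lo_cal, hi_cal))
    # Conformity score: distance by which the observation falls outside the
    # naive interval (negative when it falls inside).
    scores = np.maximum(lo_cal - y_cal, y_cal - hi_cal)
    n = scores.size
    # Finite-sample corrected rank of the score to use as the adjustment.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points to guarantee the requested coverage.
        return float("inf")
    return float(np.sort(scores)[k - 1])


def conformalize(lo, hi, q):
    """Widen (or shrink, if q < 0) the naive intervals by q on both sides."""
    return np.asarray(lo, dtype=float) - q, np.asarray(hi, dtype=float) + q


# Hypothetical simulation: the final model's point predictions are unbiased,
# but its naive 90% intervals assume sd = 5 while the true error sd is 10.
rng = np.random.default_rng(0)
n_cal, n_test = 1000, 5000
mu = rng.uniform(50, 300, size=n_cal + n_test)   # predicted biomass (made-up units)
y = mu + rng.normal(0, 10, size=mu.size)         # "field" biomass
half_width = 1.645 * 5
lo, hi = mu - half_width, mu + half_width

# Calibrate on held-out observations, then adjust the test intervals.
q = conformal_adjustment(y[:n_cal], lo[:n_cal], hi[:n_cal], alpha=0.1)
lo_c, hi_c = conformalize(lo[n_cal:], hi[n_cal:], q)

y_test = y[n_cal:]
naive_cov = float(np.mean((y_test >= lo[n_cal:]) & (y_test <= hi[n_cal:])))
conf_cov = float(np.mean((y_test >= lo_c) & (y_test <= hi_c)))
print(f"naive coverage: {naive_cov:.3f}, conformalized coverage: {conf_cov:.3f}")
```

Only the final model's intervals and a calibration set with observed values enter the adjustment, which is consistent with the statement above that the number and nature of the intermediate models do not matter. No model is refitted, so the cost is a single sort of the calibration scores.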