José Luis García-Soria, Miguel Morata, Katja Berger, Ana Belén Pascual-Venteo, Juan Pablo Rivera-Caicedo, Jochem Verrelst
Remote Sensing of Environment (Impact Factor 11.1; JCR Q1, Environmental Sciences). Published 2024-06-04. DOI: 10.1016/j.rse.2024.114228. Article page: https://www.sciencedirect.com/science/article/pii/S0034425724002463
Evaluating epistemic uncertainty estimation strategies in vegetation trait retrieval using hybrid models and imaging spectroscopy data
New-generation satellite imaging spectrometers provide an unprecedented data stream to be processed into quantifiable vegetation traits. Hybrid models have gained widespread acceptance in recent years due to their versatility in converting spectral data into traits. In hybrid models, the retrieval is obtained through a machine learning regression algorithm (MLRA) trained on a wide range of simulated data. For instance, such models are currently under development for trait retrieval in preparation for the upcoming Copernicus Hyperspectral Imaging Mission for the Environment (CHIME), among others, targeting routine estimation of canopy nitrogen content (CNC). However, like any retrieval algorithm, the process is not error-free, and most MLRAs inherently lack an uncertainty estimate for the retrieved traits, which implies a risk of misinterpretation when applying the model to real-world data. Therefore, this study aimed to assess epistemic uncertainty estimation strategies (Bayesian method, dropout, quantile regression, and bootstrapping) alongside the estimation of CNC using competitive MLRAs. Each regression model was evaluated using three data sets: (1) scenes simulated with the SCOPE 2.1 radiative transfer model under varying noise, (2) hyperspectral images from the PRISMA sensor, and (3) field-measured data. Analysis of the generated uncertainty intervals led to the following findings. First, Gaussian process regression (GPR) offers meaningful uncertainties, primarily attributable to spectral data degradation, which provide supplementary insights into the quality of trait mapping. Second, bootstrapping uncertainties can be used as quality indicators of the reliability of the estimates retrieved by hybrid models. Yet their variability depends on the MLRA used, which makes it difficult to trust their variance as a confidence interval. Third, quantile regression forest (QRF), despite not being a top-performing algorithm, exhibits outstanding robustness in both its estimates and its uncertainties when the spectral data are degraded, either by Gaussian noise or by striping, as often occurs in satellite imagery. Fourth, bootstrapped kernel ridge regression (KRR) demonstrated performance comparable to the benchmark algorithm GPR; the retrievals and uncertainties of these two MLRAs were highly correlated. Fifth, the estimates and uncertainties of bootstrapped partial least squares regression (PLSR) exhibit poor robustness to noise degradation, with the normalized root mean square error (NRMSE) increasing from 19% to 112%. Additionally, a GUI tool was integrated into the ARTMO software package for assessing epistemic uncertainties from the embedded regression algorithms, providing a trait mapping quality indicator for mapping applications and improving decision-making.
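To make the bootstrapping strategy named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation or the ARTMO tool. It assumes an RBF-kernel ridge regressor written directly in NumPy, a range-normalized NRMSE (the abstract does not state which normalization the study uses), and a synthetic five-band data set standing in for simulated spectra. All function names, hyperparameters, and data here are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def fit_krr(x, y, alpha=1e-2, length_scale=1.0):
    """Return the dual weights w solving (K + alpha * I) w = y."""
    k = rbf_kernel(x, x, length_scale)
    return np.linalg.solve(k + alpha * np.eye(len(x)), y)


def bootstrap_krr(x_train, y_train, x_new, n_models=50, seed=0,
                  alpha=1e-2, length_scale=1.0):
    """Train n_models KRR models on resampled data.

    Returns the ensemble mean (the estimate) and the ensemble standard
    deviation (the bootstrap spread, used here as an uncertainty indicator).
    """
    rng = np.random.default_rng(seed)
    preds = np.empty((n_models, len(x_new)))
    for m in range(n_models):
        # Resample the training set with replacement.
        idx = rng.integers(0, len(x_train), size=len(x_train))
        w = fit_krr(x_train[idx], y_train[idx], alpha, length_scale)
        preds[m] = rbf_kernel(x_new, x_train[idx], length_scale) @ w
    return preds.mean(axis=0), preds.std(axis=0)


def nrmse_percent(y_true, y_pred):
    """RMSE divided by the range of y_true, in percent (assumed normalization)."""
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return 100.0 * rmse / (y_true.max() - y_true.min())


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Hypothetical stand-in for simulated training data: 5 "bands", one trait.
    x_train = rng.uniform(0.0, 1.0, size=(80, 5))
    y_train = np.sin(3.0 * x_train[:, 0]) + x_train[:, 1] ** 2
    x_test = rng.uniform(0.0, 1.0, size=(40, 5))
    y_test = np.sin(3.0 * x_test[:, 0]) + x_test[:, 1] ** 2

    # Degrade the test "spectra" with increasing Gaussian noise.
    for sigma in (0.0, 0.1, 0.3):
        x_noisy = x_test + rng.normal(0.0, sigma, size=x_test.shape)
        mean, std = bootstrap_krr(x_train, y_train, x_noisy)
        print(f"sigma={sigma:.1f}  NRMSE={nrmse_percent(y_test, mean):.1f}%  "
              f"mean bootstrap spread={std.mean():.3f}")
```

The ensemble standard deviation is best read as a relative quality indicator. As the abstract cautions, its magnitude depends on the underlying regressor, so it should not be treated as a calibrated confidence interval.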
About the journal:
Remote Sensing of Environment (RSE) serves the Earth observation community by disseminating results on the theory, science, applications, and technology that contribute to advancing the field of remote sensing. With a thoroughly interdisciplinary approach, RSE encompasses terrestrial, oceanic, and atmospheric sensing.
The journal emphasizes biophysical and quantitative approaches to remote sensing at local to global scales, covering a diverse range of applications and techniques.
RSE serves as a vital platform for the exchange of knowledge and advancements in the dynamic field of remote sensing.