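Plant trait retrieval from hyperspectral data: Collective efforts in scientific data curation outperform simulated data derived from the PROSAIL model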

Daniel Mederer, Hannes Feilhauer, Eya Cherif, Katja Berger, Tobias B. Hank, Kyle R. Kovach, Phuong D. Dao, Bing Lu, Philip A. Townsend, Teja Kattenborn
ISPRS Open Journal of Photogrammetry and Remote Sensing, Volume 15, Article 100080 (published 2025-01-01)
DOI: 10.1016/j.ophoto.2024.100080
https://www.sciencedirect.com/science/article/pii/S2667393224000243
Citations: 0

Abstract

Plant trait retrieval from hyperspectral data: Collective efforts in scientific data curation outperform simulated data derived from the PROSAIL model
Plant traits play a pivotal role in steering ecosystem dynamics. As plant canopies have evolved to interact with light, spectral data convey information on a variety of plant traits. Machine learning techniques have been used successfully to retrieve diverse traits from hyperspectral data. Nonetheless, the efficacy of machine learning is restricted by limited access to high-quality reference data for training. Previous studies showed that aggregating data across domains, sensors, or growth forms, provided through collaborative efforts of the scientific community, enables the creation of transferable models. However, even such curated databases are still sparse for several traits. To address these challenges, we investigated the potential of filling such data gaps with simulated hyperspectral data generated through PROSAIL, the most widely used radiative transfer model (RTM). We coupled trait information from the TRY plant trait database with information on plant communities from the sPlot database to build a realistic input trait dataset for the RTM-based simulation of canopy spectra. Our findings indicate that simulated data can alleviate the effects of data scarcity for highly underrepresented traits. In most other cases, however, the effects of including simulated data from RTMs are negligible or even negative. While more complex RTMs promise further improvements, their parameterization remains challenging. This highlights two key observations: first, RTMs such as PROSAIL exhibit limitations in producing realistic spectra across diverse ecosystems; second, real-world data repurposed from various sources yield better retrieval success than simulated data. As a result, we advocate active data sharing over secrecy and overreliance on modeling to address data limitations.