Spatially autocorrelated training and validation samples inflate performance assessment of convolutional neural networks

ISPRS Open Journal of Photogrammetry and Remote Sensing, Pub Date: 2022-08-01, DOI: 10.1016/j.ophoto.2022.100018

Teja Kattenborn, Felix Schiefer, Julian Frey, Hannes Feilhauer, Miguel D. Mahecha, Carsten F. Dormann

Volume 5, Article 100018. Source: https://www.sciencedirect.com/science/article/pii/S2667393222000072

Citations: 26

Abstract

Deep learning and particularly Convolutional Neural Networks (CNN) in concert with remote sensing are becoming standard analytical tools in the geosciences. A series of studies has presented the seemingly outstanding performance of CNN for predictive modelling. However, the predictive performance of such models is commonly estimated using random cross-validation, which does not account for spatial autocorrelation between training and validation data. Independent of the analytical method, such spatial dependence will inevitably inflate the estimated model performance. This problem is ignored in most CNN-related studies and suggests a flaw in their validation procedure. Here, we demonstrate how neglecting spatial autocorrelation during cross-validation leads to an optimistic model performance assessment, using the example of a tree species segmentation problem in multiple, spatially distributed drone image acquisitions. We evaluated CNN-based predictions with test data sampled from 1) randomly sampled hold-outs and 2) spatially blocked hold-outs. Assuming that a block cross-validation provides a realistic model performance, a validation with randomly sampled holdouts overestimated the model performance by up to 28%. Smaller training sample size increased this optimism. Spatial autocorrelation among observations was significantly higher within than between different remote sensing acquisitions. Thus, model performance should be tested with spatial cross-validation strategies and multiple independent remote sensing acquisitions. Otherwise, the estimated performance of any geospatial deep learning method is likely to be overestimated.
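The difference between the two hold-out strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the synthetic grid of sample locations, the 50-unit block size, and the helper names `random_holdout`, `blocked_holdout`, and `shared_block_fraction` are all assumptions made for illustration. The sketch only shows how a random split scatters test samples among their spatial neighbours in the training set, whereas a blocked split holds out whole blocks.

```python
import random
from collections import defaultdict


def random_holdout(samples, test_fraction=0.2, seed=0):
    """Randomly assign samples to train/test, ignoring their location."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def block_id(sample, block_size):
    """Map a sample's (x, y) position to the grid block that contains it."""
    return (int(sample["x"] // block_size), int(sample["y"] // block_size))


def blocked_holdout(samples, block_size=50.0, test_fraction=0.2, seed=0):
    """Hold out whole spatial blocks so neighbouring samples stay together."""
    blocks = defaultdict(list)
    for s in samples:
        blocks[block_id(s, block_size)].append(s)
    ids = sorted(blocks)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_test_blocks = max(1, int(round(len(ids) * test_fraction)))
    test_ids = set(ids[:n_test_blocks])
    train = [s for b in ids if b not in test_ids for s in blocks[b]]
    test = [s for b in ids if b in test_ids for s in blocks[b]]
    return train, test


def shared_block_fraction(train, test, block_size=50.0):
    """Fraction of test samples lying in a block that also holds training data."""
    train_blocks = {block_id(s, block_size) for s in train}
    hits = sum(block_id(s, block_size) in train_blocks for s in test)
    return hits / len(test)


# Hypothetical sample locations: a regular 20 x 20 grid with 10-unit spacing.
samples = [{"x": x, "y": y} for x in range(0, 200, 10) for y in range(0, 200, 10)]

train_r, test_r = random_holdout(samples)
train_b, test_b = blocked_holdout(samples)

# Random split: test samples typically sit next to training samples.
print("random :", shared_block_fraction(train_r, test_r))
# Blocked split: 0.0 by construction, since no block is shared.
print("blocked:", shared_block_fraction(train_b, test_b))
```

A blocked split alone does not guarantee independence: samples on either side of a block boundary are still close to each other, and the abstract additionally recommends testing on multiple independent acquisitions. Under that recommendation, one option would be to group by acquisition instead of by grid block.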

Source journal

ISPRS Open Journal of Photogrammetry and Remote Sensing

CiteScore

5.10

Self-citation rate

0.00%

Number of articles published