Exploring Data Quantity Requirements for Domain Adaptation in the Classification of Satellite Image Time Series

2019 10th International Workshop on the Analysis of Multitemporal Remote Sensing Images (MultiTemp). Pub Date: 2019-08-01. DOI: 10.1109/Multi-Temp.2019.8866898

Benjamin Lucas, Charlotte Pelletier, J. Inglada, Daniel F. Schmidt, Geoffrey I. Webb, F. Petitjean


Citations: 4

Abstract

Land cover maps are a vital input variable in all types of environmental research and management. However, the modern state-of-the-art machine learning techniques used to create them require substantial training data to produce optimal accuracy. Domain adaptation is one technique researchers might use when labelled training data are unavailable or scarce. This paper looks at the result of training a convolutional neural network model on a region where data are available (source domain) and then adapting this model to another region (target domain) by retraining it on the available labelled data, and in particular at how these results change with increasing data availability. Our domain adaptation experiments on satellite image time series yield three conclusions: (1) a model trained only on data from the source domain delivers 73.0% test accuracy on the target domain; (2) when all of the weights are retrained on the target data, over 16,000 instances are required to improve upon the accuracy of the source-only model; and (3) even if sufficient data are available in the target domain, using a model pretrained on a source domain results in better overall test accuracy than a model trained on target domain data only (88.9% versus 84.7%).
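The experimental protocol in the abstract (pretrain on the source domain, retrain all weights on growing amounts of target data, and compare against a target-only model) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: a tiny logistic regression stands in for the paper's convolutional neural network, the two "regions" are synthetic 2-D data sets with different class boundaries, and the names `make_domain`, `train`, `accuracy`, and `run_experiment` are hypothetical. It is not the authors' code and will not reproduce the reported accuracies.

```python
import math
import random


def make_domain(n, shift, rng):
    """Synthetic two-class data; `shift` tilts the class boundary to mimic a different region."""
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        label = 1 if x[0] + shift * x[1] > 0 else 0
        data.append((x, label))
    return data


def predict(weights, x):
    """Probability of class 1 under a logistic model (bias, w1, w2)."""
    z = weights[0] + weights[1] * x[0] + weights[2] * x[1]
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() numerically safe
    return 1.0 / (1.0 + math.exp(-z))


def train(data, init=None, epochs=20, lr=0.1):
    """SGD on the logistic loss.

    init=None trains from scratch; otherwise ALL weights are retrained
    starting from a copy of `init` (the fine-tuning setting).
    """
    weights = list(init) if init is not None else [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, y in data:
            err = predict(weights, x) - y
            weights[0] -= lr * err
            weights[1] -= lr * err * x[0]
            weights[2] -= lr * err * x[1]
    return weights


def accuracy(weights, data):
    """Fraction of instances whose thresholded prediction matches the label."""
    correct = sum((predict(weights, x) >= 0.5) == (y == 1) for x, y in data)
    return correct / len(data)


def run_experiment(sizes=(10, 100, 1000), seed=0):
    """Compare source-only, fine-tuned, and target-only models per target-set size."""
    rng = random.Random(seed)
    source = make_domain(2000, shift=0.2, rng=rng)
    target_train = make_domain(max(sizes), shift=1.0, rng=rng)
    target_test = make_domain(1000, shift=1.0, rng=rng)

    source_model = train(source)
    results = {"source_only": accuracy(source_model, target_test)}
    for n in sizes:
        subset = target_train[:n]
        results[n] = {
            "fine_tuned": accuracy(train(subset, init=source_model), target_test),
            "target_only": accuracy(train(subset), target_test),
        }
    return results


if __name__ == "__main__":
    for key, value in run_experiment().items():
        print(key, value)
```

Sweeping `sizes` over a wider range gives a learning curve per setting, which is the kind of comparison behind the abstract's three conclusions; with a toy linear model the curves should not be expected to show the same crossover point.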

