Benjamin Lucas, Charlotte Pelletier, J. Inglada, Daniel F. Schmidt, Geoffrey I. Webb, F. Petitjean
2019 10th International Workshop on the Analysis of Multitemporal Remote Sensing Images (MultiTemp), published 2019-08-01. DOI: 10.1109/Multi-Temp.2019.8866898 (https://doi.org/10.1109/Multi-Temp.2019.8866898)
Exploring Data Quantity Requirements for Domain Adaptation in the Classification of Satellite Image Time Series
Land cover maps are a vital input variable in all types of environmental research and management. However, the state-of-the-art machine learning techniques used to create them require substantial training data to reach optimal accuracy. Domain adaptation is one technique researchers can use when labelled training data are unavailable or scarce. This paper examines the result of training a convolutional neural network model on a region where data are available (the source domain) and then adapting this model to another region (the target domain) by retraining it on the available labelled data, and in particular how these results change as more target data become available. Our domain adaptation experiments on satellite image time series yield three conclusions: (1) a model trained only on data from the source domain delivers 73.0% test accuracy on the target domain; (2) when all of the weights are retrained on the target data, over 16,000 instances are required to improve upon the accuracy of the source-only model; and (3) even if sufficient data are available in the target domain, using a model pretrained on a source domain results in better overall test accuracy than a model trained on target domain data only (88.9% versus 84.7%).
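The protocol described in the abstract (train on the source domain, retrain all weights on a growing amount of target data, and compare against a target-only model) can be sketched in a few lines. This is only an illustration under loud assumptions: a small logistic regression stands in for the paper's convolutional neural network, synthetic two-dimensional points stand in for satellite image time series, and every name here (`make_domain`, `train`, `accuracy`, `run_experiment`, the `shift` parameter) is hypothetical rather than taken from the paper. It does not reproduce the reported accuracies.

```python
import math
import random


def make_domain(n, shift, seed):
    """Toy two-class data; `shift` moves the class means to mimic a new region."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = (1.0 if label else -1.0) + shift
        x = [rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)]
        data.append((x, label))
    return data


def predict_prob(weights, x):
    """Probability of class 1 under a logistic model with a bias term."""
    z = weights[0] + weights[1] * x[0] + weights[2] * x[1]
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train(init_weights, data, epochs=20, lr=0.1):
    """SGD on the logistic loss, starting from a copy of init_weights."""
    w = list(init_weights)
    for _ in range(epochs):
        for x, y in data:
            err = predict_prob(w, x) - y
            w[0] -= lr * err
            w[1] -= lr * err * x[0]
            w[2] -= lr * err * x[1]
    return w


def accuracy(weights, data):
    """Fraction of instances whose thresholded prediction matches the label."""
    correct = sum((predict_prob(weights, x) >= 0.5) == bool(y) for x, y in data)
    return correct / len(data)


def run_experiment(sizes=(10, 100, 1000)):
    """Compare source-only, fine-tuned, and target-only models per target size."""
    source_train = make_domain(2000, shift=0.0, seed=0)
    target_train = make_domain(max(sizes), shift=1.5, seed=1)
    target_test = make_domain(1000, shift=1.5, seed=2)

    scratch_init = [0.0, 0.0, 0.0]
    source_model = train(scratch_init, source_train)
    results = {"source_only": accuracy(source_model, target_test)}
    for n in sizes:
        subset = target_train[:n]
        # Retrain all weights, starting from the source-trained model.
        finetuned = train(source_model, subset)
        # Baseline: same target data, no pretraining.
        scratch = train(scratch_init, subset)
        results[n] = {
            "finetuned": accuracy(finetuned, target_test),
            "target_only": accuracy(scratch, target_test),
        }
    return results


if __name__ == "__main__":
    for key, value in run_experiment().items():
        print(key, value)
```

Running the script prints the source-only accuracy followed by one row per target training size, which is the shape of comparison the three conclusions in the abstract are drawn from.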