RS-DeepSuperLearner: fusion of CNN ensemble for remote sensing scene classification
Author: H. Alhichri
Journal: Annals of GIS, vol. 9 1, pp. 121-142 (JCR Q1, Geography; impact factor 2.7)
Published: 2023-01-02
DOI: https://doi.org/10.1080/19475683.2023.2165544
ABSTRACT Scene classification is an important problem in remote sensing (RS) and has attracted considerable research over the past decade. Most recently proposed methods are based on deep convolutional neural network (CNN) models, and many pretrained CNN models have been investigated. Ensemble techniques are well studied in the machine learning community; however, few works have applied them to RS scene classification. In this work, we propose an ensemble approach, called RS-DeepSuperLearner, that fuses the outputs of five advanced CNN models, namely VGG16, Inception-V3, DenseNet121, InceptionResNet-V2, and EfficientNet-B3. First, we improve the architecture of the five CNN models by attaching an auxiliary branch at specific layer locations. Each model thus has two output layers, each producing its own prediction, and the final prediction is the average of the two. The RS-DeepSuperLearner method starts by fine-tuning the five CNN models on the training data. It then employs a deep neural network (DNN) SuperLearner that learns how best to fuse the outputs of the five CNN models; the SuperLearner is trained on the predicted probability outputs and the per-class cross-validation accuracies of the individual models. The proposed methodology was assessed on six publicly available RS datasets: UC Merced, KSA, RSSCN7, Optimal31, AID, and NWPU-RSC45. The experimental results demonstrate its superior capabilities when compared to state-of-the-art methods in the literature.
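The data flow described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names are hypothetical, the layout of the fusion input (each model's class probabilities followed by its per-class cross-validation accuracies) is an assumption, and a simple accuracy-weighted average stands in for the trained DNN SuperLearner.

```python
from typing import List

Probs = List[float]  # one value per class


def average_heads(main_head: Probs, aux_head: Probs) -> Probs:
    """Final prediction of one CNN: mean of its main and auxiliary outputs."""
    return [(m + a) / 2.0 for m, a in zip(main_head, aux_head)]


def build_meta_features(model_probs: List[Probs],
                        class_acc: List[Probs]) -> List[float]:
    """Assumed input layout for the fusion network: for each model, its
    class probabilities followed by its per-class CV accuracies."""
    features: List[float] = []
    for probs, acc in zip(model_probs, class_acc):
        features.extend(probs)
        features.extend(acc)
    return features


def accuracy_weighted_fusion(model_probs: List[Probs],
                             class_acc: List[Probs]) -> Probs:
    """Stand-in for the learned DNN SuperLearner (NOT the paper's method):
    weight each model's probability for a class by that model's
    cross-validation accuracy on the class, then renormalise."""
    n_classes = len(model_probs[0])
    fused: Probs = []
    for c in range(n_classes):
        total_weight = sum(acc[c] for acc in class_acc)
        score = sum(p[c] * acc[c] for p, acc in zip(model_probs, class_acc))
        fused.append(score / total_weight if total_weight > 0 else 0.0)
    norm = sum(fused)
    return [f / norm for f in fused] if norm > 0 else fused


if __name__ == "__main__":
    # Toy example: two models and two classes (the paper uses five CNNs).
    model_a = average_heads([0.6, 0.4], [0.8, 0.2])
    model_b = average_heads([0.3, 0.7], [0.5, 0.5])
    cv_acc = [[0.9, 0.5], [0.6, 0.8]]  # made-up per-class accuracies
    print(build_meta_features([model_a, model_b], cv_acc))
    print(accuracy_weighted_fusion([model_a, model_b], cv_acc))
```

In the method the abstract describes, the vector returned by `build_meta_features` would be fed to a trained neural network rather than to the fixed weighting rule used here.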