MCAFNet: Multi-Channel Attention Fusion Network-Based CNN For Remote Sensing Scene Classification
Authors: Jingming Xia, Yao Zhou, Ling Tan, Yue Ding
Journal: Photogrammetric Engineering & Remote Sensing, published 2023-03-01
DOI: 10.14358/pers.22-00121r2 (https://doi.org/10.14358/pers.22-00121r2)
Citations: 0
Abstract
Remote sensing scene images are characterized by intra-class diversity and inter-class similarity. When recognizing remote sensing images, traditional deep-learning-based classification algorithms extract only the global features of scene images, ignoring the important role of local key features in classification, which limits feature expression and restricts improvement in classification accuracy. Therefore, this paper presents a multi-channel attention fusion network (MCAFNet). First, three channels are used to extract image features. A spatial attention module is added after the max-pooling layer in two of the channels to capture the global and local key features of the image; the third channel uses the original model to extract the deep features of the image. Second, the features extracted from the different channels are effectively fused by a fusion module. Finally, an adaptive weight loss function is designed to automatically balance the contributions of the different loss terms. Three challenging data sets, the UC Merced Land-Use Dataset (UCM), the Aerial Image Dataset (AID), and the Northwestern Polytechnical University Dataset (NWPU), are selected for the experiments. Experimental results show that our algorithm can effectively recognize scenes and obtains competitive classification results.
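The abstract does not give implementation details for the spatial attention module, the fusion module, or the adaptive weight loss, so the following is only a minimal NumPy sketch of plausible stand-ins: a CBAM-style spatial attention mask, element-wise averaging as a placeholder fusion, and homoscedastic-uncertainty weighting as one common way to adaptively balance multiple loss terms. All function names, the fixed mixing weights, and the choice of uncertainty weighting are assumptions, not the paper's actual design.

```python
import numpy as np

def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(feature_map, mix_weights=(0.5, 0.5)):
    """Apply a CBAM-style spatial attention mask to a (C, H, W) feature map.

    Channel-wise average and max maps are blended (two fixed scalar weights
    stand in for a learned 1x1 convolution) and squashed through a sigmoid
    into a per-pixel mask that reweights every channel.
    """
    avg_map = feature_map.mean(axis=0)   # (H, W) average over channels
    max_map = feature_map.max(axis=0)    # (H, W) max over channels
    mask = sigmoid(mix_weights[0] * avg_map + mix_weights[1] * max_map)
    return feature_map * mask            # mask broadcasts over channels

def fuse_branches(branches):
    """Fuse same-shaped branch outputs by element-wise averaging
    (a simple placeholder for the paper's fusion module)."""
    return np.mean(np.stack(branches, axis=0), axis=0)

def adaptive_weighted_loss(losses, log_vars):
    """Combine several loss terms with per-term adaptive weights.

    Homoscedastic-uncertainty weighting: each term is scaled by exp(-s)
    and regularized by +s, so the weights can be learned jointly with
    the network instead of hand-tuned.
    """
    losses = np.asarray(losses, dtype=float)
    log_vars = np.asarray(log_vars, dtype=float)
    return float(np.sum(np.exp(-log_vars) * losses + log_vars))

# Example: three branches produce (C, H, W) features; two get attention.
c1 = spatial_attention(np.random.rand(8, 4, 4))
c2 = spatial_attention(np.random.rand(8, 4, 4))
c3 = np.random.rand(8, 4, 4)             # deep-feature branch, no attention
fused = fuse_branches([c1, c2, c3])       # (8, 4, 4) fused representation
```

In a real model the mixing weights and `log_vars` would be trainable parameters updated by backpropagation; the NumPy version only illustrates the forward-pass arithmetic.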