{"title":"Multi-attention aggregation network for remote sensing scene classification","authors":"Xin Wang, Yingying Li, Aiye Shi, Huiyu Zhou","doi":"10.1117/1.JRS.17.046508","DOIUrl":null,"url":null,"abstract":"Abstract. Remote sensing (RS) scene classification is a highly challenging task because of the unique characteristics of RS scenes, such as high intra-class variability, large inter-class similarity, and various objects with different scales. Attention, interpreted as an important mechanism of the human visual system, can emphasize meaningful features of deep neural networks, which is beneficial for boosting the classification performance. Motivated by it, we present a multi-attention aggregation network (MAANet), which contains various specially designed attention models, for precise RS scene classification. First, a gated attention fluid coding structure is constructed for mining hierarchical gated attention features from RS images. Second, a progressive pyramid refinement architecture is designed to explore correlations of cross-layer attention features to learn enhanced multi-scale representations. Third, a two-stream attention aggregation structure, equipped with three different attention models, is developed to guide the generation of aggregated features. Finally, a scene label prediction module is proposed for scene label prediction. 
We conduct extensive experiments on three famous RS scene datasets, and the experimental results show that our MAANet outperforms a number of current representative state-of-the-art approaches for the RS scene classification task.","PeriodicalId":54879,"journal":{"name":"Journal of Applied Remote Sensing","volume":"13 1","pages":"046508 - 046508"},"PeriodicalIF":1.4000,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Applied Remote Sensing","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1117/1.JRS.17.046508","RegionNum":4,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"ENVIRONMENTAL SCIENCES","Score":null,"Total":0}
Citations: 0
Abstract
Remote sensing (RS) scene classification is a highly challenging task because of the unique characteristics of RS scenes, such as high intra-class variability, large inter-class similarity, and objects at widely differing scales. Attention, regarded as an important mechanism of the human visual system, can emphasize meaningful features in deep neural networks, which helps boost classification performance. Motivated by this, we present a multi-attention aggregation network (MAANet), which contains several specially designed attention models, for precise RS scene classification. First, a gated attention fluid coding structure is constructed to mine hierarchical gated attention features from RS images. Second, a progressive pyramid refinement architecture is designed to explore correlations among cross-layer attention features and learn enhanced multi-scale representations. Third, a two-stream attention aggregation structure, equipped with three different attention models, is developed to guide the generation of aggregated features. Finally, a dedicated module predicts the scene label. We conduct extensive experiments on three well-known RS scene datasets, and the results show that MAANet outperforms a number of representative state-of-the-art approaches on the RS scene classification task.
About the journal:
The Journal of Applied Remote Sensing is a peer-reviewed journal that optimizes the communication of concepts, information, and progress among the remote sensing community.