{"title":"基于知识蒸馏的遥感图像无参数关注模型场景分类","authors":"Yubing Han, Zongyin Liu, Jiguo Yu, Anming Dong, Huihui Zhang","doi":"10.1109/MSN57253.2022.00077","DOIUrl":null,"url":null,"abstract":"Remote sensing image scene classification is to label remote sensing images as a specific scene category by understanding the semantic information of the images. It is an essential link in remote sensing image analysis and interpretation and has important research value. Convolutional neural networks (CNNs) have been dominant in remote sensing image scene classification due to their powerful feature extraction capabilities. The general trend has been to make deeper and wider CNN architectures to achieve higher classification accuracy. However, these advances to improve accuracy enlarge the network, creating too many parameters and high computational costs. Large models are difficult to deploy on resource-constrained edge devices for practical applications. Furthermore, CNNs can effectively capture local information but are weak in extracting global features. To overcome these drawbacks, we propose a novel knowledge distillation (KD) based method by employing Swin Transformer as a teacher network for guiding MobileNetV2 with Parameter-Free Attention (MobileNetV2-PFA). First, we modify MobileNetV2 by introducing PFA into the inverted bottleneck block; this improvement helps the model learn more latent and robust features without extra parameters. Second, Swin Transformer is an excellent architecture for capturing long-range dependencies via shifted window-based attention. So, we utilize the long-range dependency information from the Swin Transformer to assist MobileNetV2-PFA training through KD. Experimental results on the challenging NWPU-RESISC45 dataset show that the proposed method outperforms the original MobileNetV2 in classification accuracy with low computational consumption.","PeriodicalId":114459,"journal":{"name":"2022 18th International Conference on Mobility, Sensing and Networking (MSN)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Scene Classification Through Knowledge Distillation Enabled Parameter-Free Attention Model for Remote Sensing Images\",\"authors\":\"Yubing Han, Zongyin Liu, Jiguo Yu, Anming Dong, Huihui Zhang\",\"doi\":\"10.1109/MSN57253.2022.00077\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Remote sensing image scene classification is to label remote sensing images as a specific scene category by understanding the semantic information of the images. It is an essential link in remote sensing image analysis and interpretation and has important research value. Convolutional neural networks (CNNs) have been dominant in remote sensing image scene classification due to their powerful feature extraction capabilities. The general trend has been to make deeper and wider CNN architectures to achieve higher classification accuracy. However, these advances to improve accuracy enlarge the network, creating too many parameters and high computational costs. Large models are difficult to deploy on resource-constrained edge devices for practical applications. Furthermore, CNNs can effectively capture local information but are weak in extracting global features. 
To overcome these drawbacks, we propose a novel knowledge distillation (KD) based method by employing Swin Transformer as a teacher network for guiding MobileNetV2 with Parameter-Free Attention (MobileNetV2-PFA). First, we modify MobileNetV2 by introducing PFA into the inverted bottleneck block; this improvement helps the model learn more latent and robust features without extra parameters. Second, Swin Transformer is an excellent architecture for capturing long-range dependencies via shifted window-based attention. So, we utilize the long-range dependency information from the Swin Transformer to assist MobileNetV2-PFA training through KD. Experimental results on the challenging NWPU-RESISC45 dataset show that the proposed method outperforms the original MobileNetV2 in classification accuracy with low computational consumption.\",\"PeriodicalId\":114459,\"journal\":{\"name\":\"2022 18th International Conference on Mobility, Sensing and Networking (MSN)\",\"volume\":\"49 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 18th International Conference on Mobility, Sensing and Networking (MSN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MSN57253.2022.00077\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 18th International Conference on Mobility, Sensing and Networking (MSN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MSN57253.2022.00077","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Remote sensing image scene classification assigns a remote sensing image to a specific scene category by understanding its semantic content. It is an essential step in remote sensing image analysis and interpretation and has important research value. Convolutional neural networks (CNNs) have dominated remote sensing image scene classification thanks to their powerful feature extraction capabilities, and the general trend has been toward deeper and wider CNN architectures in pursuit of higher classification accuracy. However, these accuracy gains enlarge the network, introducing many parameters and high computational cost, so large models are difficult to deploy on resource-constrained edge devices in practical applications. Furthermore, CNNs capture local information effectively but are weak at extracting global features. To overcome these drawbacks, we propose a novel knowledge distillation (KD) based method that employs a Swin Transformer as the teacher network to guide MobileNetV2 with Parameter-Free Attention (MobileNetV2-PFA). First, we modify MobileNetV2 by introducing PFA into the inverted bottleneck block; this helps the model learn more latent and robust features without adding parameters. Second, the Swin Transformer excels at capturing long-range dependencies via shifted window-based attention, so we transfer its long-range dependency information to MobileNetV2-PFA through KD during training. Experimental results on the challenging NWPU-RESISC45 dataset show that the proposed method outperforms the original MobileNetV2 in classification accuracy at low computational cost.
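The paper's code is not included here, so the following is a minimal PyTorch sketch of the two components the abstract describes, under stated assumptions: the parameter-free attention is assumed to be a SimAM-style energy-based module placed after the depthwise convolution of the inverted bottleneck, and the distillation objective is assumed to be the standard Hinton temperature-scaled KL loss. The names (PFA, InvertedResidualPFA, kd_loss) and hyperparameters (lam=1e-4, T=4.0, alpha=0.9) are illustrative, not taken from the paper.

```python
# Sketch only (not the authors' code): SimAM-style parameter-free attention
# inside a MobileNetV2 inverted bottleneck, plus a temperature-scaled KD loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PFA(nn.Module):
    """Parameter-free attention (assumed SimAM-style energy function)."""

    def __init__(self, lam: float = 1e-4):
        super().__init__()
        self.lam = lam  # regularizer in the energy term; no learned weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, _, h, w = x.shape
        n = h * w - 1
        # Squared deviation of each position from its channel mean.
        d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)
        # Channel-wise variance estimate.
        v = d.sum(dim=(2, 3), keepdim=True) / n
        # Inverse energy: lower energy marks a more important neuron.
        e_inv = d / (4 * (v + self.lam)) + 0.5
        return x * torch.sigmoid(e_inv)


class InvertedResidualPFA(nn.Module):
    """MobileNetV2 inverted bottleneck with PFA after the depthwise conv (assumed placement)."""

    def __init__(self, c_in: int, c_out: int, stride: int = 1, expand: int = 6):
        super().__init__()
        hidden = c_in * expand
        self.use_res = stride == 1 and c_in == c_out
        self.block = nn.Sequential(
            nn.Conv2d(c_in, hidden, 1, bias=False),          # expand 1x1
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, stride, 1,
                      groups=hidden, bias=False),            # depthwise 3x3
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            PFA(),                                           # parameter-free attention
            nn.Conv2d(hidden, c_out, 1, bias=False),         # project 1x1
            nn.BatchNorm2d(c_out),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.block(x)
        return x + y if self.use_res else y


def kd_loss(student_logits, teacher_logits, labels,
            T: float = 4.0, alpha: float = 0.9):
    """Hinton-style KD: soft targets from the frozen Swin teacher plus hard labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

In such a setup the Swin Transformer teacher would run frozen in eval mode while only the MobileNetV2-PFA student is updated; because PFA introduces no learnable weights, the student retains the parameter count and FLOPs of plain MobileNetV2, which is consistent with the abstract's claim of low computational consumption on edge devices.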