AFIMNet: An Adaptive Feature Interaction Network for Remote Sensing Scene Classification
Authors: Xiao Wang; Yisha Sun; Pan He
Journal: IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5
Published: 2025-09-08
DOI: 10.1109/LGRS.2025.3607205
URL: https://ieeexplore.ieee.org/document/11153448/
Convolutional neural network (CNN)-based methods have been widely applied to remote sensing scene classification (RSSC) and have achieved strong results. However, traditional CNNs are limited in extracting global features and capturing image semantics, especially in complex remote sensing (RS) scenes. Transformers capture global features directly through self-attention, but are weaker at handling local details. Existing methods that directly combine CNN and Transformer features suffer from feature imbalance and introduce redundant information. To address these issues, we propose AFIMNet, an adaptive feature interaction network for RSSC. First, a dual-branch backbone (based on ResNet34 and Swin-S) extracts local and global features from RS scene images. Second, an adaptive feature interaction module (AFIM) strengthens the interaction and correlation between local and global features. Third, a spatial-channel fusion module (SCFM) aggregates the interacted features, further strengthening the feature representation. The method is validated on three public RS datasets, and the experimental results show that AFIMNet achieves stronger feature representation and higher classification accuracy than current popular RS image classification methods. The source code will be publicly accessible at https://github.com/xavi276310/AFIMNet
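
The abstract does not specify the internals of AFIM or SCFM, so the following is only a minimal, dependency-free sketch of the general idea of adaptively weighting a local (CNN-style) feature vector against a global (Transformer-style) one. It uses a generic sigmoid-gated blend, which is an assumption chosen for illustration and not the authors' module. The function name `adaptive_gate_fusion`, its parameters, and the toy numbers are all hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def adaptive_gate_fusion(
    local_feat: Sequence[float],
    global_feat: Sequence[float],
    gate_weights: Sequence[float],
    gate_bias: float = 0.0,
) -> List[float]:
    """Blend two equal-length feature vectors with one input-dependent gate.

    Hypothetical illustration only (not the AFIM or SCFM from the paper):
    the gate is a logistic unit applied to the concatenation
    [local_feat, global_feat].
    """
    if len(local_feat) != len(global_feat):
        raise ValueError("feature vectors must have the same length")
    if len(gate_weights) != 2 * len(local_feat):
        raise ValueError("gate_weights must cover the concatenated features")

    concatenated = list(local_feat) + list(global_feat)
    gate = sigmoid(
        sum(w * x for w, x in zip(gate_weights, concatenated)) + gate_bias
    )

    # A gate near 1 favors the local branch; a gate near 0 favors the
    # global branch. Because the gate depends on the inputs, the balance
    # between the two branches can change from image to image.
    return [gate * l + (1.0 - gate) * g for l, g in zip(local_feat, global_feat)]


# Toy example with made-up numbers (not results from the paper).
local = [1.0, 0.0]
global_ = [0.0, 1.0]
fused = adaptive_gate_fusion(local, global_, gate_weights=[0.0, 0.0, 0.0, 0.0])
print(fused)  # zero weights give gate = 0.5, so this prints [0.5, 0.5]
```

In a trained network the gate weights would be learned, and the blend would typically be applied per channel or per spatial position rather than as a single scalar; the authors' repository is the place to look for the actual design.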