Mfpenet：用于遥感场景分类的多级前景感知增强网络

The Visual Computer Pub Date : 2024-08-13 DOI:10.1007/s00371-024-03587-w

Junding Sun, Chenxu Wang, Haifeng Sima, Xiaosheng Wu, Shuihua Wang, Yudong Zhang

{"title":"Mfpenet：用于遥感场景分类的多级前景感知增强网络","authors":"Junding Sun, Chenxu Wang, Haifeng Sima, Xiaosheng Wu, Shuihua Wang, Yudong Zhang","doi":"10.1007/s00371-024-03587-w","DOIUrl":null,"url":null,"abstract":"<p>Scene classification plays a vital role in the field of remote-sensing (RS). However, remote-sensing images have the essential properties of complex scene information and large-scale spatial changes, as well as the high similarity between various classes and the significant differences within the same class, which brings great challenges to scene classification. To address these issues, a multistage foreground-perception enhancement network (MFPENet) is proposed to enhance the ability to perceive foreground features, thereby improving classification accuracy. Firstly, to enrich the scene semantics of feature information, a multi-scale feature aggregation module is specifically designed using dilated convolution, which takes the features of different stages of the backbone network as input data to obtain enhanced multiscale features. Then, a novel foreground-perception enhancement module is designed to capture foreground information. Unlike the previous methods, we separate foreground features by designing feature masks and then innovatively explore the symbiotic relationship between foreground features and scene features to improve the recognition ability of foreground features further. Finally, a hierarchical attention module is designed to reduce the interference of redundant background details on classification. By embedding the dependence between adjacent level features into the attention mechanism, the model can pay more accurate attention to the key information. Redundancy is reduced, and the loss of useful information is minimized. Experiments on three public RS scene classification datasets [UC-Merced, Aerial Image Dataset, and NWPU-RESISC45] show that our method achieves highly competitive results. Future work will focus on utilizing the background features outside the effective foreground features in the image as a decision aid to improve the distinguishability between similar scenes. The source code of our proposed algorithm and the related datasets are available at https://github.com/Hpu-wcx/MFPENet.</p>","PeriodicalId":501186,"journal":{"name":"The Visual Computer","volume":"33 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Mfpenet: multistage foreground-perception enhancement network for remote-sensing scene classification\",\"authors\":\"Junding Sun, Chenxu Wang, Haifeng Sima, Xiaosheng Wu, Shuihua Wang, Yudong Zhang\",\"doi\":\"10.1007/s00371-024-03587-w\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Scene classification plays a vital role in the field of remote-sensing (RS). However, remote-sensing images have the essential properties of complex scene information and large-scale spatial changes, as well as the high similarity between various classes and the significant differences within the same class, which brings great challenges to scene classification. To address these issues, a multistage foreground-perception enhancement network (MFPENet) is proposed to enhance the ability to perceive foreground features, thereby improving classification accuracy. Firstly, to enrich the scene semantics of feature information, a multi-scale feature aggregation module is specifically designed using dilated convolution, which takes the features of different stages of the backbone network as input data to obtain enhanced multiscale features. Then, a novel foreground-perception enhancement module is designed to capture foreground information. Unlike the previous methods, we separate foreground features by designing feature masks and then innovatively explore the symbiotic relationship between foreground features and scene features to improve the recognition ability of foreground features further. Finally, a hierarchical attention module is designed to reduce the interference of redundant background details on classification. By embedding the dependence between adjacent level features into the attention mechanism, the model can pay more accurate attention to the key information. Redundancy is reduced, and the loss of useful information is minimized. Experiments on three public RS scene classification datasets [UC-Merced, Aerial Image Dataset, and NWPU-RESISC45] show that our method achieves highly competitive results. Future work will focus on utilizing the background features outside the effective foreground features in the image as a decision aid to improve the distinguishability between similar scenes. The source code of our proposed algorithm and the related datasets are available at https://github.com/Hpu-wcx/MFPENet.</p>\",\"PeriodicalId\":501186,\"journal\":{\"name\":\"The Visual Computer\",\"volume\":\"33 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"The Visual Computer\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1007/s00371-024-03587-w\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The Visual Computer","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s00371-024-03587-w","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

场景分类在遥感（RS）领域发挥着重要作用。然而，遥感图像具有场景信息复杂、空间变化幅度大、不同类别之间相似性高、同一类别内部差异显著等基本特性，这给场景分类带来了巨大挑战。针对这些问题，提出了一种多级前景感知增强网络（MFPENet），以增强对前景特征的感知能力，从而提高分类精度。首先，为了丰富特征信息的场景语义，利用扩张卷积技术专门设计了一个多尺度特征聚合模块，将骨干网络不同阶段的特征作为输入数据，得到增强的多尺度特征。然后，我们设计了一个新颖的前景感知增强模块来捕捉前景信息。与以往方法不同的是，我们通过设计特征掩码来分离前景特征，然后创新性地探索前景特征与场景特征之间的共生关系，进一步提高前景特征的识别能力。最后，我们设计了分层注意力模块，以减少冗余背景细节对分类的干扰。通过将相邻层级特征之间的依赖关系嵌入关注机制，模型可以更准确地关注关键信息。冗余减少了，有用信息的损失也降到了最低。在三个公共 RS 场景分类数据集 [UC-Merced、Aerial Image Dataset 和 NWPU-RESISC45]上的实验表明，我们的方法取得了极具竞争力的结果。未来的工作将侧重于利用图像中有效前景特征之外的背景特征作为辅助决策工具，以提高相似场景之间的可区分性。我们提出的算法源代码和相关数据集可在 https://github.com/Hpu-wcx/MFPENet 网站上获取。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

Mfpenet: multistage foreground-perception enhancement network for remote-sensing scene classification

查看原文本刊更多论文

Mfpenet: multistage foreground-perception enhancement network for remote-sensing scene classification

Scene classification plays a vital role in the field of remote-sensing (RS). However, remote-sensing images have the essential properties of complex scene information and large-scale spatial changes, as well as the high similarity between various classes and the significant differences within the same class, which brings great challenges to scene classification. To address these issues, a multistage foreground-perception enhancement network (MFPENet) is proposed to enhance the ability to perceive foreground features, thereby improving classification accuracy. Firstly, to enrich the scene semantics of feature information, a multi-scale feature aggregation module is specifically designed using dilated convolution, which takes the features of different stages of the backbone network as input data to obtain enhanced multiscale features. Then, a novel foreground-perception enhancement module is designed to capture foreground information. Unlike the previous methods, we separate foreground features by designing feature masks and then innovatively explore the symbiotic relationship between foreground features and scene features to improve the recognition ability of foreground features further. Finally, a hierarchical attention module is designed to reduce the interference of redundant background details on classification. By embedding the dependence between adjacent level features into the attention mechanism, the model can pay more accurate attention to the key information. Redundancy is reduced, and the loss of useful information is minimized. Experiments on three public RS scene classification datasets [UC-Merced, Aerial Image Dataset, and NWPU-RESISC45] show that our method achieves highly competitive results. Future work will focus on utilizing the background features outside the effective foreground features in the image as a decision aid to improve the distinguishability between similar scenes. The source code of our proposed algorithm and the related datasets are available at https://github.com/Hpu-wcx/MFPENet.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

The Visual Computer

自引率

0.00%

发文量