A Hierarchical Graph-Enhanced Transformer Network for Remote Sensing Scene Classification

Impact factor 4.7; Zone 2, Earth Sciences; JCR Q1, ENGINEERING, ELECTRICAL & ELECTRONIC
Ziwei Li;Weiming Xu;Shiyu Yang;Juan Wang;Hua Su;Zhanchao Huang;Sheng Wu
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 17, pp. 20315-20330. Published 2024-11-04. DOI: 10.1109/JSTARS.2024.3491335
https://ieeexplore.ieee.org/document/10742489/
Citations: 0

Abstract

Remote sensing scene classification (RSSC) is essential in Earth observation, with applications in land use, environmental status, urban development, and disaster risk assessment. However, redundant background interference, varying feature scales, and high interclass similarity in remote sensing images make RSSC difficult. To address these challenges, this article proposes a hierarchical graph-enhanced transformer network (HGTNet) for RSSC. First, we introduce a dual attention (DA) module that extracts key feature information from both the channel and spatial domains, suppressing background noise. Second, we design a three-stage hierarchical transformer extractor that places a DA module at the bottleneck of each stage to facilitate information exchange between stages and uses Swin transformer blocks to capture multiscale global visual information. Third, we develop a fine-grained graph neural network extractor that models the pixel-level spatial topological relationships of scene images, which helps discriminate between similar, complex scene categories. Finally, the visual features and spatial structural features are fused through skip connections and fed into the classifier. HGTNet achieves classification accuracies of 98.47%, 95.75%, and 96.33% on the Aerial Image Dataset (AID), NWPU-RESISC45, and OPTIMAL-31 datasets, respectively, outperforming the other state-of-the-art models it is compared against. Experimental results indicate that the proposed method learns critical multiscale visual features and distinguishes between similar complex scenes, thereby improving RSSC accuracy.
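The abstract does not specify the internals of the DA module, so the following is only a minimal sketch of the general idea of gating a feature map first per channel and then per spatial position. It assumes a CBAM-style sequential ordering, parameter-free sigmoid gates, and a feature map stored as nested lists (channels x height x width); the function names are hypothetical and none of this is taken from HGTNet itself.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Scale each channel by the sigmoid of its global average.

    Assumption: no learned weights; a real module would usually pass the
    pooled vector through a small MLP before the sigmoid.
    """
    out = []
    for channel in fmap:
        count = len(channel) * len(channel[0])
        mean = sum(sum(row) for row in channel) / count
        gate = sigmoid(mean)
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(fmap):
    """Scale each pixel by the sigmoid of its mean across channels.

    Assumption: a real module would usually apply a convolution to the
    pooled map before the sigmoid.
    """
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    gates = [
        [sigmoid(sum(fmap[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]
    return [
        [[fmap[k][i][j] * gates[i][j] for j in range(w)] for i in range(h)]
        for k in range(c)
    ]


def dual_attention(fmap):
    """Apply channel gating, then spatial gating (ordering is an assumption)."""
    return spatial_attention(channel_attention(fmap))


# Toy 2-channel, 2x2 feature map with a single strong response at (0, 0).
toy = [[[4.0, 0.0], [0.0, 0.0]], [[2.0, 0.0], [0.0, 0.0]]]
gated = dual_attention(toy)
```

Because every gate lies strictly between 0 and 1, the output keeps the input's shape and never exceeds it in magnitude; positions and channels with stronger average responses retain a larger share of their value, which is the sense in which such a module can de-emphasize background.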
Source journal
CiteScore: 9.30
Self-citation rate: 10.90%
Articles published: 563
Review time: 4.7 months
Journal description: The IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing addresses the growing field of applications in Earth observations and remote sensing, and also provides a venue for the rapidly expanding special issues sponsored by the IEEE Geoscience and Remote Sensing Society. The journal draws upon the experience of the highly successful "IEEE Transactions on Geoscience and Remote Sensing" and provides a complementary medium for the wide range of topics in applied Earth observations. The "Applications" areas encompass the societal benefit areas of the Global Earth Observation System of Systems (GEOSS) program. Through deliberations over two years, ministers from 50 countries agreed to identify nine areas where Earth observation could positively impact the quality of life and health of their respective countries. Some of these, including biodiversity, health, and climate, are not traditionally addressed in the IEEE context. Yet it is the skill sets of IEEE members, in areas such as observations, communications, computers, signal processing, standards, and ocean engineering, that form the technical underpinnings of GEOSS. Thus, the journal attracts a broad range of interests, serving present members in new ways and expanding IEEE visibility into new areas.