Multiscale Spatial–Frequency-Domain Cross-Transformer for Hyperspectral Image Classification

IF 5.6; Zone 2 (Engineering Technology); JCR Q1 (ENGINEERING, ELECTRICAL & ELECTRONIC)

IEEE Transactions on Instrumentation and Measurement Pub Date : 2025-06-26 DOI:10.1109/TIM.2025.3578703

Cheng Shi;Pupu Chen;Li Fang;Minghua Zhao;Xinhong Hei;Qiguang Miao

Volume 74, pages 1-15. Full text: https://ieeexplore.ieee.org/document/11053222/

Citations: 0

Abstract


Multiscale Spatial–Frequency-Domain Cross-Transformer for Hyperspectral Image Classification

Recently, the Transformer has achieved significant success in the hyperspectral image (HSI) classification task. However, most Transformers and their variants focus on spatial-domain global feature learning and ignore the complementary characteristics provided by frequency-domain features. The fast Fourier transform (FFT), due to its sensitivity to frequency-domain information, has become a primary tool for frequency-domain analysis. However, existing FFT-based approaches often assign the same attention values to all frequency bands and do not consider the differences between bands. To fully explore and fuse spatial- and frequency-domain features, we propose a multiscale spatial–frequency-domain cross-Transformer (SFDCT-Former) network. We design a two-branch structure for spatial-domain and frequency-domain feature learning: one branch utilizes the multihead self-attention (MHSA) module for spatial-domain feature learning, while the other incorporates a multifrequency-domain Transformer (MFre-Former) encoder for frequency-domain feature learning. The MFre-Former encoder divides the frequency domain into nonoverlapping frequency bands and assigns distinct attention to each band, so that the information in each band can be captured more precisely. Furthermore, to fuse the spatial- and frequency-domain features, we design a multilevel cross-attention (MLCA) fusion module. The MLCA module combines spatial- and frequency-domain features at different levels to better capture their complementary characteristics. Extensive experiments conducted on four publicly available HSI datasets demonstrate that the proposed method outperforms nine state-of-the-art methods in classification performance. The code is available at https://github.com/AAAA-CS/SFDCT-Former
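The two ideas named in the abstract, splitting the spectrum into nonoverlapping bands that each receive their own weight, and fusing two feature streams by cross-attention, can be illustrated with a minimal NumPy sketch. This is not the authors' SFDCT-Former code (see the repository linked above for that). The function names, the equal-width radial band layout, the fixed scalar band weights standing in for learned attention, and the projection-free single-head cross-attention are all assumptions made for illustration only.

```python
import numpy as np


def radial_band_masks(h, w, num_bands):
    """Split the centred 2-D spectrum into nonoverlapping radial bands.

    Assumption: bands are equal-width rings of normalised frequency radius.
    Returns a boolean array of shape (num_bands, h, w).
    """
    fy = np.fft.fftshift(np.fft.fftfreq(h))
    fx = np.fft.fftshift(np.fft.fftfreq(w))
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    edges = np.linspace(0.0, radius.max(), num_bands + 1)
    # Interior edges only, so every bin index falls in [0, num_bands - 1].
    band_idx = np.digitize(radius, edges[1:-1])
    return np.stack([band_idx == b for b in range(num_bands)])


def band_weighted_filter(x, band_weights):
    """Scale each frequency band of x by its own weight.

    x: real array of shape (C, H, W); band_weights: shape (num_bands,).
    The fixed weights are a hypothetical stand-in for learned attention.
    """
    band_weights = np.asarray(band_weights, dtype=float)
    _, h, w = x.shape
    masks = radial_band_masks(h, w, len(band_weights)).astype(float)
    # Per-pixel gain in the centred spectrum: the weight of the band it lies in.
    gain = np.tensordot(band_weights, masks, axes=1)  # shape (H, W)
    spec = np.fft.fftshift(np.fft.fft2(x, axes=(-2, -1)), axes=(-2, -1))
    out = np.fft.ifft2(np.fft.ifftshift(spec * gain, axes=(-2, -1)), axes=(-2, -1))
    # The gain depends only on |frequency|, so the result is real up to rounding.
    return out.real


def cross_attention(query_tokens, context_tokens):
    """Single-head cross-attention without learned projections (simplification).

    query_tokens: (Nq, D), e.g. spatial-branch tokens.
    context_tokens: (Nk, D), e.g. frequency-branch tokens.
    """
    d = query_tokens.shape[-1]
    scores = query_tokens @ context_tokens.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)
    return attn @ context_tokens
```

For example, `band_weighted_filter(x, [1.0, 0.5, 0.1])` attenuates the mid and high bands of a `(C, H, W)` patch while leaving the lowest band unchanged, and `cross_attention(spatial_tokens, frequency_tokens)` returns one fused token per spatial token. In a trainable model the band weights and attention projections would be learned parameters rather than fixed values.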


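Source journal

IEEE Transactions on Instrumentation and Measurement (Engineering Technology; Engineering, Electrical & Electronic)

CiteScore: 9.00
Self-citation rate: 23.20%
Articles published: 1294
Review time: 3.9 months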

Journal scope: Papers are sought that address innovative solutions to the development and use of electrical and electronic instruments and equipment to measure, monitor and/or record physical phenomena for the purpose of advancing measurement science, methods, functionality and applications. The scope of these papers may encompass: (1) theory, methodology, and practice of measurement; (2) design, development and evaluation of instrumentation and measurement systems and components used in generating, acquiring, conditioning and processing signals; (3) analysis, representation, display, and preservation of the information obtained from a set of measurements; and (4) scientific and technical support to establishment and maintenance of technical standards in the field of Instrumentation and Measurement.