Multiscale Spatial–Frequency-Domain Cross-Transformer for Hyperspectral Image Classification
Authors: Cheng Shi; Pupu Chen; Li Fang; Minghua Zhao; Xinhong Hei; Qiguang Miao
Journal: IEEE Transactions on Instrumentation and Measurement, vol. 74, pp. 1-15 (JCR Q1, Engineering, Electrical & Electronic; impact factor 5.6)
Published: 2025-06-26
DOI: 10.1109/TIM.2025.3578703
URL: https://ieeexplore.ieee.org/document/11053222/
Recently, the Transformer has achieved significant success in hyperspectral image (HSI) classification. However, most Transformers and their variants focus on spatial-domain global feature learning and ignore the complementary characteristics provided by frequency-domain features. The fast Fourier transform (FFT), owing to its sensitivity to frequency-domain information, has become a primary tool for frequency-domain analysis, yet different frequency bands are often assigned the same attention values, and the differences between bands are not considered. To fully explore and fuse spatial- and frequency-domain features, we propose a multiscale spatial–frequency-domain cross-Transformer (SFDCT-Former) network. We design a two-branch structure: one branch uses a multihead self-attention (MHSA) module for spatial-domain feature learning, while the other incorporates a multifrequency-domain Transformer (MFre-Former) encoder for frequency-domain feature learning. The MFre-Former encoder divides the frequency domain into nonoverlapping frequency bands and assigns distinct attention to each band, so that different frequency-domain information can be captured more precisely. Furthermore, to fuse the spatial- and frequency-domain features, we design a multilevel cross-attention (MLCA) fusion module, which combines the two kinds of features at different levels to better capture their complementary characteristics. Extensive experiments on four publicly available HSI datasets demonstrate that the proposed method outperforms nine state-of-the-art methods in classification performance. The code is available at https://github.com/AAAA-CS/SFDCT-Former.
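The two ideas in the abstract, per-band weighting in the frequency domain and cross-attention between the two branches, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see their repository for that). It rests on several assumptions made only for illustration: the bands are concentric rings in the 2-D FFT of a patch, the per-band weights are fixed numbers rather than learned attention, and the cross-attention is single-head with no learned projections. The function names `radial_band_masks`, `band_weighted_filter`, and `cross_attention` are hypothetical.

```python
import numpy as np


def radial_band_masks(height, width, num_bands):
    """Partition the 2-D FFT grid into nonoverlapping concentric bands.

    Assumption: bands are rings of equal radial width; the paper's actual
    partition may differ.
    """
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)
    edges = np.linspace(0.0, radius.max(), num_bands + 1)
    # Interior edges map every frequency to exactly one band 0..num_bands-1.
    band_index = np.digitize(radius, edges[1:-1])
    return np.stack([band_index == b for b in range(num_bands)])


def band_weighted_filter(patch, band_weights):
    """Scale each frequency band of an (H, W, C) patch by its own weight.

    Assumption: fixed scalar weights stand in for learned per-band attention.
    """
    band_weights = np.asarray(band_weights, dtype=float)
    height, width = patch.shape[:2]
    masks = radial_band_masks(height, width, band_weights.size)
    gain = np.tensordot(band_weights, masks.astype(float), axes=1)  # (H, W)
    spectrum = np.fft.fft2(patch, axes=(0, 1))
    filtered = spectrum * gain[:, :, None]
    return np.fft.ifft2(filtered, axes=(0, 1)).real


def cross_attention(queries, keys_values):
    """Single-head scaled dot-product cross-attention without projections."""
    scores = queries @ keys_values.T / np.sqrt(queries.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ keys_values


rng = np.random.default_rng(0)
patch = rng.standard_normal((9, 9, 30))  # hypothetical 9x9 patch, 30 channels
freq_features = band_weighted_filter(patch, [1.0, 0.5, 0.25])

# Spatial tokens query the frequency tokens, one token per pixel.
spatial_tokens = patch.reshape(-1, patch.shape[-1])
frequency_tokens = freq_features.reshape(-1, freq_features.shape[-1])
fused = cross_attention(spatial_tokens, frequency_tokens)
```

Because the masks form a partition of the spectrum, setting every band weight to 1 returns the input patch unchanged, which is a convenient sanity check before replacing the fixed weights with learned ones.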
About the journal:
Papers are sought that address innovative solutions to the development and use of electrical and electronic instruments and equipment to measure, monitor and/or record physical phenomena for the purpose of advancing measurement science, methods, functionality and applications. The scope of these papers may encompass: (1) theory, methodology, and practice of measurement; (2) design, development and evaluation of instrumentation and measurement systems and components used in generating, acquiring, conditioning and processing signals; (3) analysis, representation, display, and preservation of the information obtained from a set of measurements; and (4) scientific and technical support to establishment and maintenance of technical standards in the field of Instrumentation and Measurement.