{"title":"SSSAT-Net: Spectral-Spatial Self-Attention-Based Transformer Network for hyperspectral image classification","authors":"Linsheng Huang , Lu Zhang , Chao Ruan , Jinling Zhao","doi":"10.1016/j.optlaseng.2025.109154","DOIUrl":null,"url":null,"abstract":"<div><div>In recent years, Convolutional Neural Networks (CNNs) have demonstrated remarkable performance in hyperspectral image classification. However, due to the high-dimensional redundancy of hyperspectral data, different spectral bands contribute significantly varying relevance to classification tasks. Additionally, existing networks commonly adopt patch-based input strategies, which struggle to effectively model spatial correlations between central pixels and their neighboring regions. The local receptive fields inherent to traditional CNNs further limit their ability to capture global features. To address these challenges, this study proposes a Spectral-Spatial Self-Attention-Based Transformer Network (SSSAT). The model first employs Principal Component Analysis (PCA) to reduce dimensionality, followed by the Convolutional Block Attention Module (CBAM) to selectively extract discriminative spectral-spatial features. Subsequently, the Spectral Attention Module (SpeAM) integrates Squeeze-and-Excitation (SE) attention with depthwise separable convolution to achieve adaptive calibration of spectral bands. The Spatial Attention Module (SpaAM) is further constructed to enhance spatial feature representation through a Manhattan Distance based Feature Vector Self-similarity Attention module (MDSA), while the Multi-Scale Convolutional Information Fusion module (MSCIF) explores spatial characteristics at multiple scales. Additionally, the Transformer architecture is utilized to extract global spectral-spatial features. Finally, classification is performed via a linear layer. Experimental results on five public datasets (Indian Pines, Kennedy Space Center, Pavia University, Houston 2013 and Salinas) demonstrate that SSSAT achieves superior classification performance compared to state-of-the-art methods.</div></div>","PeriodicalId":49719,"journal":{"name":"Optics and Lasers in Engineering","volume":"194 ","pages":"Article 109154"},"PeriodicalIF":3.7000,"publicationDate":"2025-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Optics and Lasers in Engineering","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0143816625003392","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"OPTICS","Score":null,"Total":0}
Citations: 0
Abstract
In recent years, Convolutional Neural Networks (CNNs) have demonstrated remarkable performance in hyperspectral image classification. However, due to the high-dimensional redundancy of hyperspectral data, different spectral bands differ considerably in their relevance to classification tasks. Additionally, existing networks commonly adopt patch-based input strategies, which struggle to effectively model spatial correlations between central pixels and their neighboring regions. The local receptive fields inherent to traditional CNNs further limit their ability to capture global features. To address these challenges, this study proposes a Spectral-Spatial Self-Attention-Based Transformer Network (SSSAT). The model first employs Principal Component Analysis (PCA) to reduce dimensionality, followed by the Convolutional Block Attention Module (CBAM) to selectively extract discriminative spectral-spatial features. Subsequently, the Spectral Attention Module (SpeAM) integrates Squeeze-and-Excitation (SE) attention with depthwise separable convolution to achieve adaptive calibration of spectral bands. The Spatial Attention Module (SpaAM) further enhances spatial feature representation through a Manhattan-distance-based Feature Vector Self-similarity Attention module (MDSA), while the Multi-Scale Convolutional Information Fusion module (MSCIF) explores spatial characteristics at multiple scales. Additionally, the Transformer architecture is utilized to extract global spectral-spatial features. Finally, classification is performed via a linear layer. Experimental results on five public datasets (Indian Pines, Kennedy Space Center, Pavia University, Houston 2013 and Salinas) demonstrate that SSSAT achieves superior classification performance compared to state-of-the-art methods.
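For concreteness, the pipeline described above (PCA reduction, SE-style spectral recalibration with depthwise separable convolution, Manhattan-distance self-similarity over spatial feature vectors, multi-scale convolutional fusion, a Transformer encoder, and a linear classifier) can be sketched as below. This is a minimal PyTorch sketch under assumed shapes and hyperparameters: the class names (SpectralAttention, SpatialSelfSimilarity, SSSATSketch), patch size, embedding width, and layer counts are illustrative rather than the authors' implementation, and PCA is assumed to have been applied to the hyperspectral cube beforehand.

# Minimal PyTorch sketch of the SSSAT pipeline described in the abstract.
# All module names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectralAttention(nn.Module):
    """SE-style channel recalibration followed by depthwise separable conv (SpeAM analogue)."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        self.dw = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)  # depthwise
        self.pw = nn.Conv2d(channels, channels, 1)                              # pointwise

    def forward(self, x):
        x = x * self.se(x)                               # adaptive band/channel calibration
        return self.pw(self.dw(x))

class SpatialSelfSimilarity(nn.Module):
    """Manhattan-distance self-similarity over pixel feature vectors (MDSA analogue)."""
    def forward(self, x):                                # x: (B, C, H, W)
        b, c, h, w = x.shape
        v = x.flatten(2).transpose(1, 2)                 # (B, HW, C) pixel feature vectors
        dist = torch.cdist(v, v, p=1)                    # pairwise Manhattan distances
        attn = F.softmax(-dist, dim=-1)                  # closer vectors get higher weight
        out = attn @ v                                   # (B, HW, C)
        return out.transpose(1, 2).reshape(b, c, h, w)

class SSSATSketch(nn.Module):
    def __init__(self, pca_bands=30, dim=64, n_classes=16):
        super().__init__()
        self.embed = nn.Conv2d(pca_bands, dim, 3, padding=1)
        self.spectral = SpectralAttention(dim)
        self.spatial = SpatialSelfSimilarity()
        # Multi-scale convolutional fusion (MSCIF analogue): parallel 1x1 / 3x3 / 5x5 branches.
        self.ms = nn.ModuleList([nn.Conv2d(dim, dim, k, padding=k // 2) for k in (1, 3, 5)])
        self.fuse = nn.Conv2d(3 * dim, dim, 1)
        enc = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(enc, num_layers=2)
        self.head = nn.Linear(dim, n_classes)            # final linear classifier

    def forward(self, x):                                # x: (B, pca_bands, patch, patch)
        x = self.embed(x)
        x = self.spectral(x)
        x = x + self.spatial(x)                          # residual spatial enhancement
        x = self.fuse(torch.cat([m(x) for m in self.ms], dim=1))
        tokens = x.flatten(2).transpose(1, 2)            # (B, patch*patch, dim) token sequence
        tokens = self.transformer(tokens)                # global spectral-spatial modelling
        return self.head(tokens.mean(dim=1))             # pooled tokens -> class logits

# Example: classify a batch of 9x9 patches from a PCA-reduced (30-band) hyperspectral cube.
model = SSSATSketch()
logits = model(torch.randn(8, 30, 9, 9))                 # -> (8, 16)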
About the journal:
Optics and Lasers in Engineering aims at providing an international forum for the interchange of information on the development of optical techniques and laser technology in engineering. Emphasis is placed on contributions targeted at the practical use of methods and devices, the development and enhancement of solutions and new theoretical concepts for experimental methods.
Optics and Lasers in Engineering reflects the main areas in which optical methods are being used and developed for an engineering environment. Manuscripts should offer clear evidence of novelty and significance. Papers focusing on parameter optimization or computational issues are not suitable. Similarly, papers focused on an application rather than the optical method fall outside the journal's scope. The scope of the journal is defined to include the following:
- Optical Metrology
- Optical Methods for 3D visualization and virtual engineering
- Optical Techniques for Microsystems
- Imaging, Microscopy and Adaptive Optics
- Computational Imaging
- Laser methods in manufacturing
- Integrated optical and photonic sensors
- Optics and Photonics in Life Science
- Hyperspectral and spectroscopic methods
- Infrared and Terahertz techniques