Multi-scale feature enhancement and cross-dimensional attention mechanism fusion network for hyperspectral image classification
Xiaoqing Wan, Kun Hu, Feng Chen, Yupeng He, Hui Liu
Infrared Physics & Technology, vol. 151, Article 106123. Published 2025-09-13. DOI: 10.1016/j.infrared.2025.106123
Impact factor: 3.4 (JCR Q2, Instruments & Instrumentation)
https://www.sciencedirect.com/science/article/pii/S1350449525004165
Abstract
Hyperspectral image (HSI) classification is widely used in various fields but faces significant challenges such as high dimensionality, spatial complexity, and spectral redundancy, which limit classification accuracy. Traditional machine learning methods rely on handcrafted features and struggle with high-dimensional data, while deep learning approaches still encounter difficulties in multi-level feature fusion, global-local collaborative modeling, and efficient cross-dimensional interaction. To address these challenges, this paper proposes the multi-scale feature enhancement and cross-dimensional attention mechanism fusion network (MSFE-CAMF), which integrates three key modules: a multi-scale feature extraction module (MS-FEM), a large-kernel attention and local feature fusion module (LKALFFM), and a cross-dimensional attention module (CDAM). First, the MS-FEM employs a multi-branch 3D convolutional architecture to capture multi-scale spatial dependencies and spectral correlations while maintaining computational efficiency. Additionally, a residual connection mechanism is incorporated to enhance model stability and convergence. Second, the LKALFFM combines large-kernel attention with local feature enhancement, facilitating cross-scale information capture through multi-scale learning. At the same time, it strengthens fine-grained feature sensitivity via local feature fusion, enabling a more refined representation of hyperspectral data. Finally, the CDAM integrates a cross-dimensional attention mechanism with multi-pooling channel gating (MCG) to enhance multi-dimensional feature modeling of HSI through spatial-channel information interaction, improving classification accuracy and adaptability in complex scenarios.
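The abstract names multi-pooling channel gating (MCG) but does not spell out its internals. As a rough illustration only, the sketch below shows one common way to gate channels from several pooling operators: average and max pooling feed a shared bottleneck MLP, and a sigmoid produces per-channel weights (the CBAM-style recipe). This is an assumption, not the authors' MCG; the function names, the toy weights, and the choice of pooling operators are all hypothetical, and plain lists are used to keep it dependency-free.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def shared_mlp(descriptor, w1, w2):
    """Two-layer bottleneck MLP with a ReLU in between (hypothetical weights)."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, descriptor))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_gate(feature_map, w1, w2):
    """Gate each channel using average- and max-pooled descriptors.

    feature_map: list of C channels, each a flat list of spatial values.
    Returns (gated_map, gates), where gates holds one weight in (0, 1) per channel.
    """
    avg_desc = [sum(ch) / len(ch) for ch in feature_map]  # global average pooling
    max_desc = [max(ch) for ch in feature_map]            # global max pooling
    avg_out = shared_mlp(avg_desc, w1, w2)
    max_out = shared_mlp(max_desc, w1, w2)
    gates = [sigmoid(a + m) for a, m in zip(avg_out, max_out)]
    gated = [[g * v for v in ch] for g, ch in zip(gates, feature_map)]
    return gated, gates


# Toy input: 3 channels, 4 spatial positions each (made-up numbers).
feature_map = [
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 0.0, 0.0, 0.0],
    [-1.0, 0.5, 0.5, 2.0],
]
w1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]   # 3 channels -> 2 hidden units
w2 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]]  # 2 hidden units -> 3 channels
gated, gates = channel_gate(feature_map, w1, w2)
```

Each channel is rescaled by a single scalar, so the spatial layout is untouched; any spatial-channel interaction of the kind the abstract attributes to the CDAM would have to come from a separate mechanism.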
Extensive evaluations on four popular HSI datasets show that, with 10% of the training samples, the proposed method achieves 99.61% overall accuracy on the Houston 2013 dataset, 99.96% on the Salinas dataset, 99.92% on the WHU-Hi-LongKou dataset, and 99.90% on the WHU-Hi-HanChuan dataset. These results demonstrate the competitive performance of our architecture compared to state-of-the-art methods, underscoring its effectiveness and robustness in HSI classification.
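For readers unfamiliar with the reported metric, the sketch below shows how overall accuracy (the fraction of test samples classified correctly) is typically computed after a per-class training split. It assumes the 10% figure means that roughly 10% of the labeled samples in each class are drawn for training; the abstract does not state the exact sampling protocol. The function names and toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction, seed=0):
    """Draw about train_fraction of each class for training; the rest is the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        # Keep at least one training sample per class (an assumption of this sketch).
        n_train = max(1, round(train_fraction * len(indices)))
        train_idx.extend(indices[:n_train])
        test_idx.extend(indices[n_train:])
    return sorted(train_idx), sorted(test_idx)


def overall_accuracy(y_true, y_pred):
    """Overall accuracy: correctly classified samples divided by all samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels for three classes (made-up sizes, not a real HSI dataset).
labels = [0] * 50 + [1] * 30 + [2] * 20
train_idx, test_idx = stratified_split(labels, train_fraction=0.10)
toy_oa = overall_accuracy([0, 1, 2, 2], [0, 1, 1, 2])
```

Because overall accuracy weights every sample equally, large classes dominate it; HSI papers often report average accuracy and the kappa coefficient alongside it for that reason.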
About the journal:
The journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. "Infrared" is defined as covering the near, mid and far infrared (terahertz) regions from 0.75 µm (750 nm) to 1 mm (300 GHz). Submissions in the 300 GHz to 100 GHz region may be accepted at the editors' discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region.
Its core topics can be summarized as the generation, propagation and detection of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine.
Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.