HSI-MFormer: Integrating Mamba and Transformer Experts for Hyperspectral Image Classification

Authors: Yan He; Bing Tu; Bo Liu; Jun Li; Antonio Plaza
Journal: IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-16
DOI: 10.1109/TGRS.2025.3564167
Published: 2025-04-24
Citations: 0
Abstract
Hyperspectral image (HSI) classification is fundamental to numerous remote sensing applications, enabling detailed analysis of material properties and environmental conditions. Recent Mamba architectures, built upon selective state space models (SSMs, i.e., S6), have demonstrated exceptional advantages in long-range sequence modeling with linear computational complexity, while Transformers, based on self-attention mechanisms, are particularly adept at capturing short-range dependencies. To leverage the complementary strengths of these models, this article introduces a novel hybrid Mamba-Transformer framework (HSI-MFormer) that effectively exploits the multiscale properties of hyperspectral data for HSI classification. First, a multiscale token generation (MTG) module is developed, which converts the HSI cube into multiple spatial-spectral token groups across different scales. To adequately capture fine-grained multiscale spatial-spectral patterns, an inner-scale Transformer expert (ITE) is designed, which incorporates grouped self-attention operations to perform short-range sequence modeling within the token groups at each scale. Meanwhile, a cross-scale Mamba expert (CME) is introduced, which integrates a cross-scale serialization mechanism with a bidirectional Mamba block for long-range sequence modeling, further exploring the interactions and complementarity between token groups across different scales. Several hybrid strategies for integrating the ITE and CME, including parallel, interval, and serial structures, are investigated to maximize their complementarity. Extensive experiments demonstrate that the proposed HSI-MFormer significantly outperforms state-of-the-art Transformer-based and Mamba-based HSI classification methods. The code is available at http://github.com/tubingnuist/HSI-MFormer.
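The abstract describes the MTG module only at a high level. The following is a minimal NumPy sketch of one plausible reading: an HSI cube is cropped into spatial windows of several scales around a center pixel, and each window becomes one group of spatial-spectral tokens (one token per pixel, with the full spectrum as the token vector). The function name, window sizes, and grouping scheme are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def multiscale_token_groups(cube, scales=(3, 5, 7)):
    """Convert an HSI cube (H, W, B) into spatial-spectral token groups,
    one group per spatial scale, centered on the cube's middle pixel.
    Each token is the B-dimensional spectrum of one pixel in the window."""
    H, W, B = cube.shape
    cy, cx = H // 2, W // 2
    groups = []
    for s in scales:
        r = s // 2
        # Crop an s x s spatial window around the center pixel.
        win = cube[cy - r:cy + r + 1, cx - r:cx + r + 1, :]
        # Flatten the window into s*s tokens of dimension B.
        groups.append(win.reshape(s * s, B))
    return groups

cube = np.random.rand(9, 9, 20)   # toy HSI patch: 9x9 pixels, 20 bands
groups = multiscale_token_groups(cube)
print([g.shape for g in groups])  # [(9, 20), (25, 20), (49, 20)]
```

Under this reading, the ITE would apply grouped self-attention within each of these per-scale token groups, while the CME would serialize tokens across the groups before running the bidirectional Mamba block.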
Journal Introduction:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.