MHFu-former: A multispectral and hyperspectral image fusion transformer
Xue Wang, Songling Yin, Xiaojun Xu, Yong Mei, Yan Huang, Kun Tan
International Journal of Applied Earth Observation and Geoinformation (ITC Journal), Volume 143, Article 104843, 1 September 2025
DOI: 10.1016/j.jag.2025.104843
https://www.sciencedirect.com/science/article/pii/S156984322500490X
Hyperspectral images (HSIs) can capture detailed spectral features for object recognition, while multispectral images (MSIs) can provide a high spatial resolution for accurate object location. Deep learning methods have been widely applied in the fusion of hyperspectral and multispectral images, but still face challenges, including the limited capacity to enhance spatial details and preserve spectral information, as well as issues related to spatial scale dependency. In this paper, to solve the above problems and achieve more effective information integration between HSIs and MSIs, we propose a novel multispectral and hyperspectral image fusion transformer (MHFu-former). The proposed MHFu-former consists of two main components: (1) a feature extraction and fusion module, which first extracts deep multi-scale features from the hyperspectral and multispectral imagery and fuses them to form a joint feature map, which is then processed by a dual-branch structure consisting of a Swin transformer module and convolutional module to capture the global context and fine-grained spatial features, respectively; and (2) a spatial-spectral fusion attention mechanism, which adaptively enhances the important spectral information and fuses it with the spatial detail information, significantly boosting the model’s sensitivity to the key spectral features while preserving rich spatial details. We conducted comparative experiments on the indoor Cave dataset and the Shanghai and Ganzhou datasets from the ZY1-02D satellite to validate the effectiveness and superiority of the proposed method. Compared to the state-of-the-art methods, the proposed method significantly enhances the fusion performance across multiple key metrics, demonstrating its outstanding ability to process spatial and spectral details.
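The abstract does not spell out how the spatial-spectral fusion attention is computed. As a rough illustration only, the sketch below uses a squeeze-and-excitation-style channel weighting (a hypothetical stand-in, not the authors' mechanism): each spectral band is pooled globally, the pooled responses are turned into band weights, and the reweighted spectral features are fused with the spatial-detail branch by addition.

```python
import numpy as np

def spatial_spectral_fusion_attention(spectral, spatial):
    """Toy sketch of a spatial-spectral fusion attention step.

    spectral, spatial: arrays of shape (bands, H, W), e.g. the outputs
    of the spectral (transformer) and spatial (convolutional) branches.
    """
    # Squeeze: global average pooling over the spatial dimensions
    band_means = spectral.mean(axis=(1, 2))            # shape: (bands,)
    # Excite: softmax turns pooled responses into per-band weights
    w = np.exp(band_means - band_means.max())
    w = w / w.sum()
    # Reweight each band (rescaled so total magnitude is preserved),
    # then fuse with the spatial-detail branch by addition
    enhanced = spectral * w[:, None, None] * spectral.shape[0]
    return enhanced + spatial
```

In this toy form, bands with stronger average response receive larger weights, mimicking the "adaptively enhances the important spectral information" behavior described above; the real module would learn these weights rather than derive them from a fixed softmax.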
Journal description:
The International Journal of Applied Earth Observation and Geoinformation publishes original papers that utilize earth observation data for natural resource and environmental inventory and management. These data primarily originate from remote sensing platforms, including satellites and aircraft, supplemented by surface and subsurface measurements. Addressing natural resources such as forests, agricultural land, soils, and water, as well as environmental concerns like biodiversity, land degradation, and hazards, the journal explores conceptual and data-driven approaches. It covers geoinformation themes like capturing, databasing, visualization, interpretation, data quality, and spatial uncertainty.