{"title":"用简单的注意力取代复杂的变压器,实现高光谱和多光谱图像融合","authors":"Kunpeng Mu , Wenqing Wang , Mingze Gao , Han Liu","doi":"10.1016/j.engappai.2025.111959","DOIUrl":null,"url":null,"abstract":"<div><div>The fusion of hyperspectral (HSI) and multispectral (MSI) images to obtain high-resolution hyperspectral images is crucial for hyperspectral image processing and interpretation. In recent years, the Transformer architecture is extensively utilized in the domain of HSI-MSI fusion, yielding promising results. However, the introduction of various modified Transformer architectures leads to increasingly complex network structures, which impose significant computational resource demands. To address this issue, this paper proposes a Mask and Cross-Attention Network (MCANet) for HSI and MSI fusion. The network comprises three components: masked feature extraction, channel and spatial attention cross-feature fusion, and multi-scale step-by-step reconstruction. Our network abandons the existing patchwork of various advanced attention mechanisms and instead employs only the most straightforward channel spectrum and spatial attention mechanisms. This approach allows us to thoroughly extract spectral and spatial features while minimizing the computational resources required by the model. We conduct experiments on 8 HSI datasets and compare them with state-of-the-art fusion methods. The Pavia Center, Pavia University, Washington DC, Salinas, Houston and Botswana datasets all achieve the best fusion images and metrics. In addition, efficiency experiments confirm that the proposed method saves a significant amount of computational resources while fusing high-quality images. 
The source code and pre-trained models are available at <span><span>https://github.com/xiaomudsg/MCANet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"160 ","pages":"Article 111959"},"PeriodicalIF":8.0000,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Replacing complex transformer with simple attention to achieve hyperspectral and multispectral image fusion\",\"authors\":\"Kunpeng Mu , Wenqing Wang , Mingze Gao , Han Liu\",\"doi\":\"10.1016/j.engappai.2025.111959\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The fusion of hyperspectral (HSI) and multispectral (MSI) images to obtain high-resolution hyperspectral images is crucial for hyperspectral image processing and interpretation. In recent years, the Transformer architecture is extensively utilized in the domain of HSI-MSI fusion, yielding promising results. However, the introduction of various modified Transformer architectures leads to increasingly complex network structures, which impose significant computational resource demands. To address this issue, this paper proposes a Mask and Cross-Attention Network (MCANet) for HSI and MSI fusion. The network comprises three components: masked feature extraction, channel and spatial attention cross-feature fusion, and multi-scale step-by-step reconstruction. Our network abandons the existing patchwork of various advanced attention mechanisms and instead employs only the most straightforward channel spectrum and spatial attention mechanisms. This approach allows us to thoroughly extract spectral and spatial features while minimizing the computational resources required by the model. We conduct experiments on 8 HSI datasets and compare them with state-of-the-art fusion methods. 
The Pavia Center, Pavia University, Washington DC, Salinas, Houston and Botswana datasets all achieve the best fusion images and metrics. In addition, efficiency experiments confirm that the proposed method saves a significant amount of computational resources while fusing high-quality images. The source code and pre-trained models are available at <span><span>https://github.com/xiaomudsg/MCANet</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":50523,\"journal\":{\"name\":\"Engineering Applications of Artificial Intelligence\",\"volume\":\"160 \",\"pages\":\"Article 111959\"},\"PeriodicalIF\":8.0000,\"publicationDate\":\"2025-08-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Engineering Applications of Artificial Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0952197625019670\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Engineering Applications of Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0952197625019670","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Replacing complex transformer with simple attention to achieve hyperspectral and multispectral image fusion
The fusion of hyperspectral (HSI) and multispectral (MSI) images to obtain high-resolution hyperspectral images is crucial for hyperspectral image processing and interpretation. In recent years, the Transformer architecture has been extensively utilized in the domain of HSI-MSI fusion, yielding promising results. However, the introduction of various modified Transformer architectures has led to increasingly complex network structures that impose significant computational resource demands. To address this issue, this paper proposes a Mask and Cross-Attention Network (MCANet) for HSI and MSI fusion. The network comprises three components: masked feature extraction, channel- and spatial-attention cross-feature fusion, and multi-scale step-by-step reconstruction. Our network abandons the prevailing patchwork of advanced attention mechanisms and instead employs only the simplest channel (spectral) and spatial attention mechanisms. This approach allows us to thoroughly extract spectral and spatial features while minimizing the computational resources required by the model. We conduct experiments on eight HSI datasets and compare the results with state-of-the-art fusion methods. On the Pavia Center, Pavia University, Washington DC, Salinas, Houston, and Botswana datasets, our method achieves the best fused images and metrics. In addition, efficiency experiments confirm that the proposed method saves a significant amount of computational resources while producing high-quality fused images. The source code and pre-trained models are available at https://github.com/xiaomudsg/MCANet.
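To make the "simple attention" idea concrete, the NumPy sketch below shows a textbook minimal form of channel and spatial attention applied to a feature cube: each channel is reweighted by a gate from its global average, and each spatial position by a gate from the channel-wise mean and max. This is a generic illustration of the mechanism family the abstract names, not the paper's exact MCANet implementation; the function names and gating choices here are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    """Reweight each spectral channel of a (C, H, W) cube by a gate
    derived from its global average response."""
    desc = feat.mean(axis=(1, 2))          # (C,) per-channel descriptor
    gate = sigmoid(desc)                   # (C,) gate values in (0, 1)
    return feat * gate[:, None, None]      # broadcast over spatial dims

def spatial_attention(feat):
    """Reweight each spatial position by a gate derived from the
    channel-wise mean and max maps."""
    avg = feat.mean(axis=0)                # (H, W) mean over channels
    mx = feat.max(axis=0)                  # (H, W) max over channels
    gate = sigmoid(avg + mx)               # (H, W) spatial gate
    return feat * gate[None, :, :]         # broadcast over channels

rng = np.random.default_rng(0)
hsi_feat = rng.standard_normal((8, 4, 4))  # toy 8-band, 4x4 feature cube
out = spatial_attention(channel_attention(hsi_feat))
print(out.shape)  # (8, 4, 4): attention preserves the feature shape
```

Because both gates are computed from simple pooled statistics, the cost is linear in the feature size, with no quadratic token-by-token similarity matrix of the kind a full Transformer self-attention layer would build.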
Journal Introduction:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.