Bin Wang, Xingchuang Xiong, Yusheng Lian, Xuheng Cao, Han Zhou, Kun Yu, Zilong Liu
{"title":"基于多尺度混合训练的感知光谱变压器展开网络用于任意尺度高光谱与多光谱图像融合","authors":"Bin Wang , Xingchuang Xiong , Yusheng Lian , Xuheng Cao , Han Zhou , Kun Yu , Zilong Liu","doi":"10.1016/j.inffus.2025.103166","DOIUrl":null,"url":null,"abstract":"<div><div>Hyperspectral imaging offers rich spectral information but often suffers from a tradeoff between spatial and spectral resolutions owing to hardware limitations. To address this, hyperspectral image (HSI)-multispectral image (MSI) fusion techniques have emerged, which combines low-resolution HSI (LR-HSI) with high-resolution MSI (HR-MSI) to generate HR-HSI. However, existing methods often struggle with generalization across varying image resolutions and lack interpretability due to reliance on deep learning models without physical degradation constraints. This study introduces two major innovations to overcome these challenges: (1) a resolution-independent unfolding algorithm and a multiscale training framework, which allows flexible adaptation to LR-HSI of any resolution without increasing model complexity, thereby enhancing generalization in dynamic remote sensing environments and (2) a novel degradation design using real LR-HSI and HR-MSI as priors to guide spatial-spectral degradation and, in the external framework, implementing degradation constraints, thereby ensuring accurate approximation of true degradation processes. In addition, this study proposes a perceptive spectral transformer with perceptive spectral attention in a U-Net architecture to adaptively transfer spectral information, improving fusion accuracy. Experimental results highlight advantages of our approach under both single-scale training and multiscale mixed training conditions. Compared with eight state-of-the-art fusion algorithms, our approach demonstrates exceptional performance under single-scale training conditions. More importantly, through multiscale mixed training, its performance is further enhanced, achieving super-resolution magnifications ranging from 4 × to 128 ×, validating the effectiveness of the framework. Experiments demonstrated the robustness and practical applicability of the proposed approach in simulated and real-world scenarios; the code is available at <span><span>https://github.com/XWangBin/PSTUN</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103166"},"PeriodicalIF":15.5000,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Perceptive spectral transformer unfolding network with multiscale mixed training for arbitrary-scale hyperspectral and multispectral image fusion\",\"authors\":\"Bin Wang , Xingchuang Xiong , Yusheng Lian , Xuheng Cao , Han Zhou , Kun Yu , Zilong Liu\",\"doi\":\"10.1016/j.inffus.2025.103166\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Hyperspectral imaging offers rich spectral information but often suffers from a tradeoff between spatial and spectral resolutions owing to hardware limitations. To address this, hyperspectral image (HSI)-multispectral image (MSI) fusion techniques have emerged, which combines low-resolution HSI (LR-HSI) with high-resolution MSI (HR-MSI) to generate HR-HSI. However, existing methods often struggle with generalization across varying image resolutions and lack interpretability due to reliance on deep learning models without physical degradation constraints. 
This study introduces two major innovations to overcome these challenges: (1) a resolution-independent unfolding algorithm and a multiscale training framework, which allows flexible adaptation to LR-HSI of any resolution without increasing model complexity, thereby enhancing generalization in dynamic remote sensing environments and (2) a novel degradation design using real LR-HSI and HR-MSI as priors to guide spatial-spectral degradation and, in the external framework, implementing degradation constraints, thereby ensuring accurate approximation of true degradation processes. In addition, this study proposes a perceptive spectral transformer with perceptive spectral attention in a U-Net architecture to adaptively transfer spectral information, improving fusion accuracy. Experimental results highlight advantages of our approach under both single-scale training and multiscale mixed training conditions. Compared with eight state-of-the-art fusion algorithms, our approach demonstrates exceptional performance under single-scale training conditions. More importantly, through multiscale mixed training, its performance is further enhanced, achieving super-resolution magnifications ranging from 4 × to 128 ×, validating the effectiveness of the framework. Experiments demonstrated the robustness and practical applicability of the proposed approach in simulated and real-world scenarios; the code is available at <span><span>https://github.com/XWangBin/PSTUN</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":50367,\"journal\":{\"name\":\"Information Fusion\",\"volume\":\"122 \",\"pages\":\"Article 103166\"},\"PeriodicalIF\":15.5000,\"publicationDate\":\"2025-04-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Fusion\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1566253525002398\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1566253525002398","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Perceptive spectral transformer unfolding network with multiscale mixed training for arbitrary-scale hyperspectral and multispectral image fusion
Hyperspectral imaging offers rich spectral information but often suffers from a tradeoff between spatial and spectral resolution owing to hardware limitations. To address this, hyperspectral image (HSI)-multispectral image (MSI) fusion techniques have emerged, which combine a low-resolution HSI (LR-HSI) with a high-resolution MSI (HR-MSI) to generate an HR-HSI. However, existing methods often struggle to generalize across varying image resolutions and lack interpretability because they rely on deep learning models without physical degradation constraints. This study introduces two major innovations to overcome these challenges: (1) a resolution-independent unfolding algorithm and a multiscale training framework, which allow flexible adaptation to LR-HSIs of any resolution without increasing model complexity, thereby enhancing generalization in dynamic remote sensing environments; and (2) a novel degradation design that uses the real LR-HSI and HR-MSI as priors to guide spatial-spectral degradation and enforces degradation constraints in the external framework, thereby ensuring an accurate approximation of the true degradation process. In addition, this study proposes a perceptive spectral transformer with perceptive spectral attention in a U-Net architecture to adaptively transfer spectral information, improving fusion accuracy. Experimental results highlight the advantages of our approach under both single-scale and multiscale mixed training conditions. Compared with eight state-of-the-art fusion algorithms, our approach demonstrates exceptional performance under single-scale training. More importantly, multiscale mixed training further enhances its performance, achieving super-resolution magnifications from 4× to 128× and validating the effectiveness of the framework. Experiments demonstrate the robustness and practical applicability of the proposed approach in both simulated and real-world scenarios; the code is available at https://github.com/XWangBin/PSTUN.
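For readers unfamiliar with the setup, unfolding-based fusion methods such as this one are typically built on a linear observation model: the LR-HSI is a spatially degraded version of the latent HR-HSI, and the HR-MSI is a spectrally degraded one. Below is a minimal NumPy sketch of that standard model, assuming a simple box-average spatial degradation and a random, row-normalized spectral response function (SRF) purely for illustration; these fixed operators, function names, and array shapes are not the paper's actual learned degradation priors.

```python
import numpy as np

# Illustrative sketch of the standard HSI-MSI observation model
# (box-average blur and random SRF are assumptions, not the paper's operators).

def spatial_degrade(hr_hsi: np.ndarray, scale: int) -> np.ndarray:
    """LR-HSI = spatial downsampling of the HR-HSI (here: box-average per block)."""
    bands, h, w = hr_hsi.shape
    assert h % scale == 0 and w % scale == 0
    return hr_hsi.reshape(bands, h // scale, scale, w // scale, scale).mean(axis=(2, 4))

def spectral_degrade(hr_hsi: np.ndarray, srf: np.ndarray) -> np.ndarray:
    """HR-MSI = SRF applied across bands; srf maps L hyperspectral to l multispectral bands."""
    return np.tensordot(srf, hr_hsi, axes=([1], [0]))

# Toy example: 31-band HR-HSI at 128x128 pixels, 4x spatial scale, 3-band MSI.
rng = np.random.default_rng(0)
X = rng.random((31, 128, 128))                          # latent HR-HSI
R = rng.random((3, 31))
R /= R.sum(axis=1, keepdims=True)                       # row-normalized SRF
Y = spatial_degrade(X, scale=4)                         # LR-HSI: (31, 32, 32)
Z = spectral_degrade(X, R)                              # HR-MSI: (3, 128, 128)
print(Y.shape, Z.shape)
```

A resolution-independent unfolding network inverts this model iteratively rather than learning a fixed-scale mapping, which is what allows a single trained model to handle the 4x-128x range reported above.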
Journal introduction:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers presenting fundamental theoretical analyses, as well as those demonstrating application to real-world problems, are welcome.