Bin Wang, Xingchuang Xiong, Yusheng Lian, Xuheng Cao, Han Zhou, Kun Yu, Zilong Liu
{"title":"基于多尺度混合训练的感知光谱变压器展开网络用于任意尺度高光谱与多光谱图像融合","authors":"Bin Wang , Xingchuang Xiong , Yusheng Lian , Xuheng Cao , Han Zhou , Kun Yu , Zilong Liu","doi":"10.1016/j.inffus.2025.103166","DOIUrl":null,"url":null,"abstract":"<div><div>Hyperspectral imaging offers rich spectral information but often suffers from a tradeoff between spatial and spectral resolutions owing to hardware limitations. To address this, hyperspectral image (HSI)-multispectral image (MSI) fusion techniques have emerged, which combines low-resolution HSI (LR-HSI) with high-resolution MSI (HR-MSI) to generate HR-HSI. However, existing methods often struggle with generalization across varying image resolutions and lack interpretability due to reliance on deep learning models without physical degradation constraints. This study introduces two major innovations to overcome these challenges: (1) a resolution-independent unfolding algorithm and a multiscale training framework, which allows flexible adaptation to LR-HSI of any resolution without increasing model complexity, thereby enhancing generalization in dynamic remote sensing environments and (2) a novel degradation design using real LR-HSI and HR-MSI as priors to guide spatial-spectral degradation and, in the external framework, implementing degradation constraints, thereby ensuring accurate approximation of true degradation processes. In addition, this study proposes a perceptive spectral transformer with perceptive spectral attention in a U-Net architecture to adaptively transfer spectral information, improving fusion accuracy. Experimental results highlight advantages of our approach under both single-scale training and multiscale mixed training conditions. Compared with eight state-of-the-art fusion algorithms, our approach demonstrates exceptional performance under single-scale training conditions. More importantly, through multiscale mixed training, its performance is further enhanced, achieving super-resolution magnifications ranging from 4 × to 128 ×, validating the effectiveness of the framework. Experiments demonstrated the robustness and practical applicability of the proposed approach in simulated and real-world scenarios; the code is available at <span><span>https://github.com/XWangBin/PSTUN</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103166"},"PeriodicalIF":15.5000,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Perceptive spectral transformer unfolding network with multiscale mixed training for arbitrary-scale hyperspectral and multispectral image fusion\",\"authors\":\"Bin Wang , Xingchuang Xiong , Yusheng Lian , Xuheng Cao , Han Zhou , Kun Yu , Zilong Liu\",\"doi\":\"10.1016/j.inffus.2025.103166\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Hyperspectral imaging offers rich spectral information but often suffers from a tradeoff between spatial and spectral resolutions owing to hardware limitations. To address this, hyperspectral image (HSI)-multispectral image (MSI) fusion techniques have emerged, which combines low-resolution HSI (LR-HSI) with high-resolution MSI (HR-MSI) to generate HR-HSI. However, existing methods often struggle with generalization across varying image resolutions and lack interpretability due to reliance on deep learning models without physical degradation constraints. 
This study introduces two major innovations to overcome these challenges: (1) a resolution-independent unfolding algorithm and a multiscale training framework, which allows flexible adaptation to LR-HSI of any resolution without increasing model complexity, thereby enhancing generalization in dynamic remote sensing environments and (2) a novel degradation design using real LR-HSI and HR-MSI as priors to guide spatial-spectral degradation and, in the external framework, implementing degradation constraints, thereby ensuring accurate approximation of true degradation processes. In addition, this study proposes a perceptive spectral transformer with perceptive spectral attention in a U-Net architecture to adaptively transfer spectral information, improving fusion accuracy. Experimental results highlight advantages of our approach under both single-scale training and multiscale mixed training conditions. Compared with eight state-of-the-art fusion algorithms, our approach demonstrates exceptional performance under single-scale training conditions. More importantly, through multiscale mixed training, its performance is further enhanced, achieving super-resolution magnifications ranging from 4 × to 128 ×, validating the effectiveness of the framework. Experiments demonstrated the robustness and practical applicability of the proposed approach in simulated and real-world scenarios; the code is available at <span><span>https://github.com/XWangBin/PSTUN</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":50367,\"journal\":{\"name\":\"Information Fusion\",\"volume\":\"122 \",\"pages\":\"Article 103166\"},\"PeriodicalIF\":15.5000,\"publicationDate\":\"2025-04-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Fusion\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1566253525002398\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1566253525002398","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Perceptive spectral transformer unfolding network with multiscale mixed training for arbitrary-scale hyperspectral and multispectral image fusion
Hyperspectral imaging offers rich spectral information but often suffers from a tradeoff between spatial and spectral resolution owing to hardware limitations. To address this, hyperspectral image (HSI)-multispectral image (MSI) fusion techniques have emerged, which combine a low-resolution HSI (LR-HSI) with a high-resolution MSI (HR-MSI) to generate an HR-HSI. However, existing methods often struggle to generalize across varying image resolutions and lack interpretability because they rely on deep learning models without physical degradation constraints. This study introduces two major innovations to overcome these challenges: (1) a resolution-independent unfolding algorithm and a multiscale training framework, which allow flexible adaptation to LR-HSIs of any resolution without increasing model complexity, thereby enhancing generalization in dynamic remote sensing environments; and (2) a novel degradation design that uses the real LR-HSI and HR-MSI as priors to guide spatial-spectral degradation and enforces degradation constraints in the external framework, thereby ensuring an accurate approximation of the true degradation process. In addition, this study proposes a perceptive spectral transformer with perceptive spectral attention in a U-Net architecture to adaptively transfer spectral information, improving fusion accuracy. Experimental results highlight the advantages of our approach under both single-scale and multiscale mixed training conditions. Compared with eight state-of-the-art fusion algorithms, our approach demonstrates exceptional performance under single-scale training. More importantly, multiscale mixed training further enhances its performance, achieving super-resolution magnifications from 4× to 128× and validating the effectiveness of the framework. Experiments demonstrate the robustness and practical applicability of the proposed approach in both simulated and real-world scenarios; the code is available at https://github.com/XWangBin/PSTUN.
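For readers unfamiliar with the setup, unfolding-based fusion methods such as this one are typically built on a linear observation model: the LR-HSI is a spatially degraded version of the latent HR-HSI, and the HR-MSI is a spectrally degraded one. Below is a minimal NumPy sketch of that standard model, assuming a simple box-average spatial degradation and a random, row-normalized spectral response function (SRF) purely for illustration; these fixed operators, function names, and array shapes are not the paper's actual learned degradation priors.

```python
import numpy as np

# Illustrative sketch of the standard HSI-MSI observation model
# (box-average blur and random SRF are assumptions, not the paper's operators).

def spatial_degrade(hr_hsi: np.ndarray, scale: int) -> np.ndarray:
    """LR-HSI = spatial downsampling of the HR-HSI (here: box-average per block)."""
    bands, h, w = hr_hsi.shape
    assert h % scale == 0 and w % scale == 0
    return hr_hsi.reshape(bands, h // scale, scale, w // scale, scale).mean(axis=(2, 4))

def spectral_degrade(hr_hsi: np.ndarray, srf: np.ndarray) -> np.ndarray:
    """HR-MSI = SRF applied across bands; srf maps L hyperspectral to l multispectral bands."""
    return np.tensordot(srf, hr_hsi, axes=([1], [0]))

# Toy example: 31-band HR-HSI at 128x128 pixels, 4x spatial scale, 3-band MSI.
rng = np.random.default_rng(0)
X = rng.random((31, 128, 128))                          # latent HR-HSI
R = rng.random((3, 31))
R /= R.sum(axis=1, keepdims=True)                       # row-normalized SRF
Y = spatial_degrade(X, scale=4)                         # LR-HSI: (31, 32, 32)
Z = spectral_degrade(X, R)                              # HR-MSI: (3, 128, 128)
print(Y.shape, Z.shape)
```

A resolution-independent unfolding network inverts this model iteratively rather than learning a fixed-scale mapping, which is what allows a single trained model to handle the 4x-128x range reported above.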
Journal introduction:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers presenting fundamental theoretical analyses, as well as those demonstrating application to real-world problems, are welcome.