Channel-wise mask learning based mixing transformer for spectral compressive imaging

IF 3.7; Zone 3 (Computer Science); Q2, AUTOMATION & CONTROL SYSTEMS

Journal of The Franklin Institute-engineering and Applied Mathematics; Pub Date: 2025-03-24; DOI: 10.1016/j.jfranklin.2025.107635

Wenyu Xie, Ping Xu, Haifeng Zheng, Yian Liu

Volume 362, Issue 8, Article 107635
Full text: https://www.sciencedirect.com/science/article/pii/S0016003225001292

Citations: 0

Channel-wise mask learning based mixing transformer for spectral compressive imaging

Single disperser coded aperture spectral imaging (SD-CASSI) is well known for its simple optical path, which acquires spectral images efficiently. However, reconstructing a hyperspectral image from its measurement is an ill-posed and challenging problem. Deep learning methods make it possible to reconstruct high-quality hyperspectral images from measurement images in real time. However, mainstream models typically use an encoder–decoder structure that connects the output of the encoder to the input of the decoder only along the channel dimension, which limits the ability of the network to learn detailed image information. In addition, because the planar image sensor array causes different wavelengths to experience different optical path differences after dispersion, the actual mask cannot be derived solely from a single known mask through different dispersion steps. To address these issues, this paper proposes a deep unfolding method called the channel-wise mask learning based mixing Transformer network (CML-MT). We design a denoising model based on window attention and a dual block, using the dual block as the decoder to fully utilize information from the encoder layers. Additionally, we introduce a channel-wise degradation mask learning module that implicitly learns to approximate the latent real mask under the constraint of a multi-stage reprojection loss. Experimental results demonstrate that, with these solutions, our model, extended to only three stages, is competitive with state-of-the-art models and excels at reconstructing details and textures in real-world scenarios.
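The abstract gives no equations, so the following is only a minimal sketch of the quantities it refers to, under stated assumptions. It assumes the SD-CASSI measurement model commonly used in the literature: each spectral band is modulated by a mask, shifted along the sensor by a per-band dispersion step, and summed into one 2D measurement. All function names (`shifted_masks`, `measure`, `reprojection_loss`) are hypothetical, and the loss is assumed to be a mean squared error summed over stages; the paper's actual mask learning module and loss may differ.

```python
def shifted_masks(mask, num_bands, step):
    """Conventional assumption: every band sees the same known mask,
    shifted by step * k columns on the sensor."""
    height, width = len(mask), len(mask[0])
    sensor_width = width + step * (num_bands - 1)
    masks = []
    for k in range(num_bands):
        band_mask = [[0.0] * sensor_width for _ in range(height)]
        for i in range(height):
            for j in range(width):
                band_mask[i][j + step * k] = float(mask[i][j])
        masks.append(band_mask)
    return masks


def measure(cube, channel_masks, step):
    """Assumed forward model: band k of the (bands, H, W) cube lands at
    column offset step * k, is weighted by its own sensor-plane mask,
    and all bands are summed into one 2D measurement."""
    num_bands = len(cube)
    height, width = len(cube[0]), len(cube[0][0])
    sensor_width = width + step * (num_bands - 1)
    y = [[0.0] * sensor_width for _ in range(height)]
    for k in range(num_bands):
        for i in range(height):
            for j in range(width):
                col = j + step * k
                y[i][col] += channel_masks[k][i][col] * cube[k][i][j]
    return y


def reprojection_loss(stage_estimates, channel_masks, y, step):
    """Assumed form of a multi-stage reprojection loss: re-measure each
    stage's estimate and sum the mean squared errors against y."""
    total = 0.0
    for estimate in stage_estimates:
        y_hat = measure(estimate, channel_masks, step)
        squared = [(a - b) ** 2
                   for row_hat, row in zip(y_hat, y)
                   for a, b in zip(row_hat, row)]
        total += sum(squared) / len(squared)
    return total


# Toy example: 2 bands, 2x2 pixels, dispersion step of 1 column.
mask = [[1, 0], [0, 1]]
cube = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
masks = shifted_masks(mask, num_bands=2, step=1)
y = measure(cube, masks, step=1)
print(y)  # [[1.0, 5.0, 0.0], [0.0, 4.0, 8.0]]
```

Because `measure` takes one mask per band rather than a single shared mask, the shifted copies produced by `shifted_masks` could be replaced by independently adjusted per-band masks. That is the kind of substitution a channel-wise mask learning module would make, with the reprojection loss as the signal that pulls the masks toward ones consistent with the real measurement.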

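Source journal

Journal of The Franklin Institute-engineering and Applied Mathematics (Engineering Technology - Engineering: Electrical & Electronic)

CiteScore: 7.30
Self-citation rate: 14.60%
Articles published: 586
Review time: 6.9 months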

Journal description: The Journal of The Franklin Institute has an established reputation for publishing high-quality papers in the field of engineering and applied mathematics. Its current focus is on control systems, complex networks and dynamic systems, signal processing and communications and their applications. All submitted papers are peer-reviewed. The Journal will publish original research papers and research review papers of substance. Papers and special focus issues are judged upon possible lasting value, which has been and continues to be the strength of the Journal of The Franklin Institute.