Title: Densely aggregated U-net with spatial-spectral interaction transformer for hyperspectral compressed imaging reconstruction
Author: Yun-Hui Li
Journal: Journal of Visual Communication and Image Representation, Vol. 117, Article 104795
DOI: 10.1016/j.jvcir.2026.104795 (https://www.sciencedirect.com/science/article/pii/S1047320326000908)
Published: 2026-04-01 (Epub 2026-03-27); Journal Article; not open access
Impact Factor: 3.1; JCR: Q2 (Computer Science, Information Systems); CAS Region 4 (Computer Science)
Citations: 0
Abstract
Hyperspectral imaging offers critical spectral information for applications such as material analysis and camouflage recognition. However, the acquisition of hyperspectral data cubes is inherently constrained by the Nyquist sampling theorem. While compressed sensing theory enables snapshot imaging by compressing the data cube into a 2D measurement, the ill-posed reconstruction remains a significant challenge. Recent deep learning methods, particularly vision transformers, have advanced the state of the art (SOTA). Despite this, existing networks typically employ spectral or spatial self-attention in isolation, blindly pursuing a global receptive field at the cost of computational efficiency and representational flexibility. Additionally, the vanilla skip connection in U-Nets is insufficient for effective multi-scale information transmission between the encoder and decoder. To address these issues, we propose a Densely aggregated U-Net with a Spatial-Spectral Interaction Transformer (DSST). DSST parallelizes patch-based spectral self-attention and window-based spatial self-attention, complemented by an interaction mechanism. Furthermore, it introduces a densely aggregated skip connection to collect multi-scale features and bridge the semantic gap. Experimental results on both simulated and real-world scenes demonstrate that DSST achieves competitive performance with lower computational and memory costs than other end-to-end networks. Moreover, it offers faster inference speeds than deep unfolding networks.
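To make the two parallel attention branches concrete, the following is a minimal NumPy sketch of the general idea: spectral self-attention treats each band as a token (so the attention map is only C x C), while window-based spatial self-attention treats pixels as tokens but restricts attention to non-overlapping windows, keeping cost linear in image size. This is an illustrative reconstruction from the abstract, not the paper's actual DSST module; in particular, the learned interaction mechanism is replaced here by a fixed mixing weight `alpha`, and the projection layers of real multi-head attention are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spectral_attention(cube):
    """Self-attention across spectral bands of a (C, H, W) cube.

    Each band's flattened spatial map is its feature vector, so the
    attention map is only C x C regardless of image size.
    """
    C, H, W = cube.shape
    tokens = cube.reshape(C, H * W)                       # (C, HW)
    attn = softmax(tokens @ tokens.T / np.sqrt(H * W))    # (C, C)
    return (attn @ tokens).reshape(C, H, W)

def window_spatial_attention(cube, win=2):
    """Spatial self-attention within non-overlapping win x win windows.

    Pixels are tokens, but attention never crosses a window boundary,
    so cost grows linearly with the number of windows. Assumes H and W
    are divisible by `win`.
    """
    C, H, W = cube.shape
    out = np.empty_like(cube)
    for i in range(0, H, win):
        for j in range(0, W, win):
            patch = cube[:, i:i + win, j:j + win].reshape(C, -1).T  # (win*win, C)
            attn = softmax(patch @ patch.T / np.sqrt(C))            # (win*win, win*win)
            out[:, i:i + win, j:j + win] = (attn @ patch).T.reshape(C, win, win)
    return out

def parallel_spatial_spectral(cube, alpha=0.5):
    """Run both branches in parallel and blend their outputs.

    `alpha` is a hypothetical fixed mixing weight standing in for the
    paper's learned spatial-spectral interaction mechanism.
    """
    return alpha * spectral_attention(cube) + (1 - alpha) * window_spatial_attention(cube)
```

The design point the abstract argues for is visible here: neither branch ever forms a full (HW x HW) global attention map, which is where the computational and memory savings over globally attentive transformers come from.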
Journal introduction:
The Journal of Visual Communication and Image Representation publishes papers on state-of-the-art visual communication and image representation, with emphasis on novel technologies and theoretical work in this multidisciplinary area of pure and applied research. The field of visual communication and image representation is considered in its broadest sense and covers both digital and analog aspects as well as processing and communication in biological visual systems.