{"title":"Progressive Self-Supervised Learning for CASSI Computational Spectral Cameras","authors":"Xiaoyin Mei;Yuqi Li;Qiang Fu;Wolfgang Heidrich","doi":"10.1109/TCI.2024.3463478","DOIUrl":"10.1109/TCI.2024.3463478","url":null,"abstract":"Compressive spectral imaging (CSI) is a technique used to capture high-dimensional hyperspectral images (HSIs) with a few multiplexed measurements, thereby reducing data acquisition costs and complexity. However, existing CSI methods often rely on end-to-end learning from training sets, which may struggle to generalize well to unseen scenes and phenomena. In this paper, we present a progressive self-supervised method specifically tailored for coded aperture snapshot spectral imaging (CASSI). Our proposed method enables HSI reconstruction solely from the measurements, without requiring any ground truth spectral data. To achieve this, we integrate positional encoding and spectral cluster-centroid features within a novel progressive training framework. Additionally, we employ an attention mechanism and a multi-scale architecture to enhance the robustness and accuracy of HSI reconstruction. Through extensive experiments on both synthetic and real datasets, we validate the effectiveness of our method. Our results demonstrate significantly superior performance compared to state-of-the-art self-supervised CASSI methods, while utilizing fewer parameters and consuming less memory. Furthermore, our proposed approach showcases competitive performance in terms of reconstruction quality when compared to state-of-the-art supervised methods.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1505-1518"},"PeriodicalIF":4.2,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142249424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DS$^{2}$PN: A Two-Stage Direction-Aware Spectral-Spatial Perceptual Network for Hyperspectral Image Reconstruction","authors":"Tiecheng Song;Zheng Zhang;Kaizhao Zhang;Anyong Qin;Feng Yang;Chenqiang Gao","doi":"10.1109/TCI.2024.3458421","DOIUrl":"10.1109/TCI.2024.3458421","url":null,"abstract":"Coded aperture snapshot spectral imaging (CASSI) systems are designed to modulate and compress 3D hyperspectral images (HSIs) into 2D measurements, which can capture HSIs in dynamic scenes. How to faithfully recover 3D HSIs from 2D measurements becomes one of the challenges. Impressive results have been achieved by deep leaning methods based on convolutional neural networks and transformers, but the directional information is not thoroughly explored to reconstruct HSIs and evaluate the reconstruction quality. In view of this, we propose a two-stage direction-aware spectral-spatial perceptual network (DS\u0000<inline-formula><tex-math>$^{2}$</tex-math></inline-formula>\u0000PN) for HSI reconstruction. In the first stage, we design a frequency-based preliminary reconstruction subnetwork to roughly recover the global spectral-spatial information of HSIs via frequency interactions. In the second stage, we design a multi-directional spectral-spatial refinement subnetwork to recover the details of HSIs via directional attention mechanisms. To train the whole network, we build a pixel-level reconstruction loss for each subnetwork, and a feature-level multi-directional spectral-spatial perceptual loss which is specially tailored to high-dimensional HSIs. Experimental results show that our DS\u0000<inline-formula><tex-math>$^{2}$</tex-math></inline-formula>\u0000PN outperforms state-of-the-art methods in quantitative and qualitative evaluation for both simulation and real HSI reconstruction tasks.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1346-1356"},"PeriodicalIF":4.2,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Zero-Shot Image Denoising for High-Resolution Electron Microscopy","authors":"Xuanyu Tian;Zhuoya Dong;Xiyue Lin;Yue Gao;Hongjiang Wei;Yanhang Ma;Jingyi Yu;Yuyao Zhang","doi":"10.1109/TCI.2024.3458411","DOIUrl":"10.1109/TCI.2024.3458411","url":null,"abstract":"High-resolution electron microscopy (HREM) imaging technique is a powerful tool for directly visualizing a broad range of materials in real-space. However, it faces challenges in denoising due to ultra-low signal-to-noise ratio (SNR) and scarce data availability. In this work, we propose Noise2SR, a zero-shot self-supervised learning (ZS-SSL) denoising framework for HREM. Within our framework, we propose a super-resolution (SR) based self-supervised training strategy, incorporating the Random Sub-sampler module. The Random Sub-sampler is designed to generate approximate infinite noisy pairs from a single noisy image, serving as an effective data augmentation in zero-shot denoising. Noise2SR trains the network with paired noisy images of different resolutions, which is conducted via SR strategy. The SR-based training facilitates the network adopting more pixels for supervision, and the random sub-sampling helps compel the network to learn continuous signals enhancing the robustness. Meanwhile, we mitigate the uncertainty caused by random-sampling by adopting minimum mean squared error (MMSE) estimation for the denoised results. With the distinctive integration of training strategy and proposed designs, Noise2SR can achieve superior denoising performance using a single noisy HREM image. We evaluate the performance of Noise2SR in both simulated and real HREM denoising tasks. It outperforms state-of-the-art ZS-SSL methods and achieves comparable denoising performance with supervised methods. The success of Noise2SR suggests its potential for improving the SNR of images in material imaging domains.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1462-1475"},"PeriodicalIF":4.2,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DDFSRM: Denoising Diffusion Fusion Model for Line-Scanning Super-Resolution","authors":"Rui Liu;Ying Xiao;Yini Peng;Xin Tian","doi":"10.1109/TCI.2024.3458468","DOIUrl":"10.1109/TCI.2024.3458468","url":null,"abstract":"Line-scanning super-resolution (LSSR) provides a new way to improve the spatial resolution of images. To further improve its super-resolution (SR) performance boosted by deep learning, a new denoising diffusion fusion super-resolution model (DDFSRM) is proposed in this paper. Considering the reconstruction optimization problem in LSSR is ill-posed, we first build a model-based fusion SR guidance and take the diffusion model sampling mean as an implicit prior learned from data to constrain the optimization model, which improves the model's accuracy. Then, the solution of the model is embedded in the iterative process of diffusion sampling. Finally, a posterior sampling model based on the denoising diffusion probabilistic model for LSSR task is obtained to achieve a good balance between denoising and SR capabilities by combining explicit and implicit priors. Both simulated and real experiments show that DDFSRM outperforms other state-of-the-art SR methods in both qualitative and quantitative evaluation.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1357-1367"},"PeriodicalIF":4.2,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ProxNF: Neural Field Proximal Training for High-Resolution 4D Dynamic Image Reconstruction","authors":"Luke Lozenski;Refik Mert Cam;Mark D. Pagel;Mark A. Anastasio;Umberto Villa","doi":"10.1109/TCI.2024.3458397","DOIUrl":"10.1109/TCI.2024.3458397","url":null,"abstract":"Accurate spatiotemporal image reconstruction methods are needed for a wide range of biomedical research areas but face challenges due to data incompleteness and computational burden. Data incompleteness arises from the undersampling often required to increase frame rates, while computational burden emerges due to the memory footprint of high-resolution images with three spatial dimensions and extended time horizons. Neural fields (NFs), an emerging class of neural networks that act as continuous representations of spatiotemporal objects, have previously been introduced to solve these dynamic imaging problems by reframing image reconstruction as a problem of estimating network parameters. Neural fields can address the twin challenges of data incompleteness and computational burden by exploiting underlying redundancies in these spatiotemporal objects. This work proposes ProxNF, a novel neural field training approach for spatiotemporal image reconstruction leveraging proximal splitting methods to separate computations involving the imaging operator from updates of the network parameters. Specifically, ProxNF evaluates the (subsampled) gradient of the data-fidelity term in the image domain and uses a fully supervised learning approach to update the neural field parameters. This method is demonstrated in two numerical phantom studies and an in-vivo application to tumor perfusion imaging in small animal models using dynamic contrast-enhanced photoacoustic computed tomography (DCE PACT).","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1368-1383"},"PeriodicalIF":4.2,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OpEnCam: Lensless Optical Encryption Camera","authors":"Salman S. Khan;Xiang Yu;Kaushik Mitra;Manmohan Chandraker;Francesco Pittaluga","doi":"10.1109/TCI.2024.3451953","DOIUrl":"10.1109/TCI.2024.3451953","url":null,"abstract":"Lensless cameras multiplex the incoming light before it is recorded by the sensor. This ability to multiplex the incoming light has led to the development of ultra-thin, high-speed, and single-shot 3D imagers. Recently, there have been various attempts at demonstrating another useful aspect of lensless cameras - their ability to preserve the privacy of a scene by capturing encrypted measurements. However, existing lensless camera designs suffer numerous inherent privacy vulnerabilities. To demonstrate this, we develop the first comprehensive attack model for encryption cameras, and propose \u0000<sc>OpEnCam</small>\u0000– a novel lensless optical \u0000<bold>en</b>\u0000cryption \u0000<bold>ca</b>\u0000mera design that overcomes these vulnerabilities. \u0000<sc>OpEnCam</small>\u0000 encrypts the incoming light before capturing it using the modulating ability of optical masks. Recovery of the original scene from an \u0000<sc>OpEnCam</small>\u0000 measurement is possible only if one has access to the camera's encryption key, defined by the unique optical elements of each camera. Our \u0000<sc>OpEnCam</small>\u0000 design introduces two major improvements over existing lensless camera designs - (a) the use of two co-axially located optical masks, one stuck to the sensor and the other a few millimeters above the sensor and (b) the design of mask patterns, which are derived heuristically from signal processing ideas. We show, through experiments, that \u0000<sc>OpEnCam</small>\u0000 is robust against a range of attack types while still maintaining the imaging capabilities of existing lensless cameras. We validate the efficacy of \u0000<sc>OpEnCam</small>\u0000 using simulated and real data. Finally, we built and tested a prototype in the lab for proof-of-concept.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1306-1316"},"PeriodicalIF":4.2,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Boosting Light Field Spatial Super-Resolution via Masked Light Field Modeling","authors":"Da Yang;Hao Sheng;Sizhe Wang;Shuai Wang;Zhang Xiong;Wei Ke","doi":"10.1109/TCI.2024.3451998","DOIUrl":"10.1109/TCI.2024.3451998","url":null,"abstract":"Light field (LF) imaging benefits a wide range of applications with geometry information it captured. However, due to the restricted sensor resolution, LF cameras sacrifice spatial resolution for sufficient angular resolution. Hence LF spatial super-resolution (LFSSR), which highly relies on inter-intra view correlation extraction, is widely studied. In this paper, a self-supervised pre-training scheme, named masked LF modeling (MLFM), is proposed to boost the learning of inter-intra view correlation for better super-resolution performance. To achieve this, we first introduce a transformer structure, termed as LFormer, to establish direct inter-view correlations inside the 4D LF. Compared with traditional disentangling operations for LF feature extraction, LFormer avoids unnecessary loss in angular domain. Therefore it performs better in learning the cross-view mapping among pixels with MLFM pre-training. Then by cascading LFormers as encoder, LFSSR network LFormer-Net is designed, which comprehensively performs inter-intra view high-frequency information extraction. In the end, LFormer-Net is pre-trained with MLFM by introducing a Spatially-Random Angularly-Consistent Masking (SRACM) module. With a high masking ratio, MLFM pre-training effectively promotes the performance of LFormer-Net. Extensive experiments on public datasets demonstrate the effectiveness of MLFM pre-training and LFormer-Net. Our approach outperforms state-of-the-art LFSSR methods numerically and visually on both small- and large-disparity datasets.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1317-1330"},"PeriodicalIF":4.2,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implicit Neural Networks With Fourier-Feature Inputs for Free-Breathing Cardiac MRI Reconstruction","authors":"Johannes F. Kunz;Stefan Ruschke;Reinhard Heckel","doi":"10.1109/TCI.2024.3452008","DOIUrl":"10.1109/TCI.2024.3452008","url":null,"abstract":"Cardiacmagnetic resonance imaging (MRI) requires reconstructing a real-time video of a beating heart from continuous highly under-sampled measurements. This task is challenging since the object to be reconstructed (the heart) is continuously changing during signal acquisition. In this paper, we propose a reconstruction approach based on representing the beating heart with an implicit neural network and fitting the network so that the representation of the heart is consistent with the measurements. The network in the form of a multi-layer perceptron with Fourier-feature inputs acts as an effective signal prior and enables adjusting the regularization strength in both the spatial and temporal dimensions of the signal. We study the proposed approach for 2D free-breathing cardiac real-time MRI in different operating regimes, i.e., for different image resolutions, slice thicknesses, and acquisition lengths. Our method achieves reconstruction quality on par with or slightly better than state-of-the-art untrained convolutional neural networks and superior image quality compared to a recent method that fits an implicit representation directly to k-space measurements. However, this comes at a relatively high computational cost. Our approach does not require any additional patient data or biosensors including electrocardiography, making it potentially applicable in a wide range of clinical scenarios.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1280-1289"},"PeriodicalIF":4.2,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Signal Processing Society Information","authors":"","doi":"10.1109/TCI.2024.3353650","DOIUrl":"https://doi.org/10.1109/TCI.2024.3353650","url":null,"abstract":"","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"C2-C2"},"PeriodicalIF":4.2,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10646370","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142077624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Scale Energy (MuSE) Framework for Inverse Problems in Imaging","authors":"Jyothi Rikhab Chand;Mathews Jacob","doi":"10.1109/TCI.2024.3449101","DOIUrl":"https://doi.org/10.1109/TCI.2024.3449101","url":null,"abstract":"We introduce multi-scale energy models to learn the prior distribution of images, which can be used in inverse problems to derive the Maximum A Posteriori (MAP) estimate and to sample from the posterior distribution. Compared to the traditional single-scale energy models, the multi-scale strategy improves the estimation accuracy and convergence of the MAP algorithm, even when it is initialized far away from the solution. We propose two kinds of multi-scale strategies: a) the explicit (e-MuSE) framework, where we use a sequence of explicit energies, each corresponding to a smooth approximation of the original negative log-prior, and b) the implicit (i-MuSE), where we rely on a single energy function whose gradients at different scales closely match the corresponding e-MuSE gradients. Although both schemes improve convergence and accuracy, the e-MuSE MAP solution depends on the scheduling strategy, including the choice of intermediate scales and exit conditions. In contrast, the i-MuSE formulation is significantly simpler, resulting in faster convergence and improved performance. We compare the performance of the proposed MuSE models in the context of Magnetic Resonance (MR) image recovery. The results demonstrate that the multi-scale framework yields a MAP reconstruction comparable in quality to the End-to-End (E2E) trained models, while being relatively unaffected by the changes in the forward model. In addition, the i-MuSE scheme also allows the generation of samples from the posterior distribution, enabling us to estimate the uncertainty maps.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1250-1265"},"PeriodicalIF":4.2,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142152078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}