{"title":"Boosting Light Field Spatial Super-Resolution via Masked Light Field Modeling","authors":"Da Yang;Hao Sheng;Sizhe Wang;Shuai Wang;Zhang Xiong;Wei Ke","doi":"10.1109/TCI.2024.3451998","DOIUrl":"10.1109/TCI.2024.3451998","url":null,"abstract":"Light field (LF) imaging benefits a wide range of applications with geometry information it captured. However, due to the restricted sensor resolution, LF cameras sacrifice spatial resolution for sufficient angular resolution. Hence LF spatial super-resolution (LFSSR), which highly relies on inter-intra view correlation extraction, is widely studied. In this paper, a self-supervised pre-training scheme, named masked LF modeling (MLFM), is proposed to boost the learning of inter-intra view correlation for better super-resolution performance. To achieve this, we first introduce a transformer structure, termed as LFormer, to establish direct inter-view correlations inside the 4D LF. Compared with traditional disentangling operations for LF feature extraction, LFormer avoids unnecessary loss in angular domain. Therefore it performs better in learning the cross-view mapping among pixels with MLFM pre-training. Then by cascading LFormers as encoder, LFSSR network LFormer-Net is designed, which comprehensively performs inter-intra view high-frequency information extraction. In the end, LFormer-Net is pre-trained with MLFM by introducing a Spatially-Random Angularly-Consistent Masking (SRACM) module. With a high masking ratio, MLFM pre-training effectively promotes the performance of LFormer-Net. Extensive experiments on public datasets demonstrate the effectiveness of MLFM pre-training and LFormer-Net. Our approach outperforms state-of-the-art LFSSR methods numerically and visually on both small- and large-disparity datasets.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1317-1330"},"PeriodicalIF":4.2,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implicit Neural Networks With Fourier-Feature Inputs for Free-Breathing Cardiac MRI Reconstruction","authors":"Johannes F. Kunz;Stefan Ruschke;Reinhard Heckel","doi":"10.1109/TCI.2024.3452008","DOIUrl":"10.1109/TCI.2024.3452008","url":null,"abstract":"Cardiacmagnetic resonance imaging (MRI) requires reconstructing a real-time video of a beating heart from continuous highly under-sampled measurements. This task is challenging since the object to be reconstructed (the heart) is continuously changing during signal acquisition. In this paper, we propose a reconstruction approach based on representing the beating heart with an implicit neural network and fitting the network so that the representation of the heart is consistent with the measurements. The network in the form of a multi-layer perceptron with Fourier-feature inputs acts as an effective signal prior and enables adjusting the regularization strength in both the spatial and temporal dimensions of the signal. We study the proposed approach for 2D free-breathing cardiac real-time MRI in different operating regimes, i.e., for different image resolutions, slice thicknesses, and acquisition lengths. Our method achieves reconstruction quality on par with or slightly better than state-of-the-art untrained convolutional neural networks and superior image quality compared to a recent method that fits an implicit representation directly to k-space measurements. However, this comes at a relatively high computational cost. Our approach does not require any additional patient data or biosensors including electrocardiography, making it potentially applicable in a wide range of clinical scenarios.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1280-1289"},"PeriodicalIF":4.2,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Signal Processing Society Information","authors":"","doi":"10.1109/TCI.2024.3353650","DOIUrl":"https://doi.org/10.1109/TCI.2024.3353650","url":null,"abstract":"","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"C2-C2"},"PeriodicalIF":4.2,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10646370","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142077624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Scale Energy (MuSE) Framework for Inverse Problems in Imaging","authors":"Jyothi Rikhab Chand;Mathews Jacob","doi":"10.1109/TCI.2024.3449101","DOIUrl":"https://doi.org/10.1109/TCI.2024.3449101","url":null,"abstract":"We introduce multi-scale energy models to learn the prior distribution of images, which can be used in inverse problems to derive the Maximum A Posteriori (MAP) estimate and to sample from the posterior distribution. Compared to the traditional single-scale energy models, the multi-scale strategy improves the estimation accuracy and convergence of the MAP algorithm, even when it is initialized far away from the solution. We propose two kinds of multi-scale strategies: a) the explicit (e-MuSE) framework, where we use a sequence of explicit energies, each corresponding to a smooth approximation of the original negative log-prior, and b) the implicit (i-MuSE), where we rely on a single energy function whose gradients at different scales closely match the corresponding e-MuSE gradients. Although both schemes improve convergence and accuracy, the e-MuSE MAP solution depends on the scheduling strategy, including the choice of intermediate scales and exit conditions. In contrast, the i-MuSE formulation is significantly simpler, resulting in faster convergence and improved performance. We compare the performance of the proposed MuSE models in the context of Magnetic Resonance (MR) image recovery. The results demonstrate that the multi-scale framework yields a MAP reconstruction comparable in quality to the End-to-End (E2E) trained models, while being relatively unaffected by the changes in the forward model. In addition, the i-MuSE scheme also allows the generation of samples from the posterior distribution, enabling us to estimate the uncertainty maps.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1250-1265"},"PeriodicalIF":4.2,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142152078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Provable Probabilistic Imaging Using Score-Based Generative Priors","authors":"Yu Sun;Zihui Wu;Yifan Chen;Berthy T. Feng;Katherine L. Bouman","doi":"10.1109/TCI.2024.3449114","DOIUrl":"10.1109/TCI.2024.3449114","url":null,"abstract":"Estimating high-quality images while also quantifying their uncertainty are two desired features in an image reconstruction algorithm for solving ill-posed inverse problems. In this paper, we propose \u0000<italic>plug-and-play Monte Carlo (PMC)</i>\u0000 as a principled framework for characterizing the space of possible solutions to a general inverse problem. PMC is able to incorporate expressive score-based generative priors for high-quality image reconstruction while also performing uncertainty quantification via posterior sampling. In particular, we develop two PMC algorithms that can be viewed as the sampling analogues of the traditional plug-and-play priors (PnP) and regularization by denoising (RED) algorithms. To improve the sampling efficiency, we introduce weighted annealing into these PMC algorithms, further developing two additional annealed PMC algorithms (APMC). We establish a theoretical analysis for characterizing the convergence behavior of PMC algorithms. Our analysis provides non-asymptotic stationarity guarantees in terms of the Fisher information, fully compatible with the joint presence of weighted annealing, potentially non-log-concave likelihoods, and imperfect score networks. We demonstrate the performance of the PMC algorithms on multiple representative inverse problems with both linear and nonlinear forward models. Experimental results show that PMC significantly improves reconstruction quality and enables high-fidelity uncertainty quantification.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1290-1305"},"PeriodicalIF":4.2,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Reconstruction and Spatial Super-Resolution of Hyper-Spectral CTIS Images via Multi-Scale Refinement","authors":"Mazen Mel;Alexander Gatto;Pietro Zanuttigh","doi":"10.1109/TCI.2024.3446230","DOIUrl":"10.1109/TCI.2024.3446230","url":null,"abstract":"The Computed Tomography Imaging Spectrometer (CTIS) is a snapshot imaging device that captures Hyper-Spectral images as two-dimensional compressed sensor measurements. Computational post-processing algorithms are later needed to recover the latent object cube. However, iterative algorithms typically used to solve this task require large computational resources and, furthermore, these approaches are very sensitive to the presumed system and noise models. In addition, the poor spatial resolution of the \u0000<inline-formula><tex-math>$0$</tex-math></inline-formula>\u0000th diffraction order image limits the usability of CTIS in favor of other snapshot spectrometers even though it enables higher spectral resolution. In this paper we introduce a learning-based computational model exploiting a reconstruction network with iterative refinement, that is able to recover high quality hyper-spectral images leveraging complementary spatio-spectral information scattered across the CTIS sensor image. We showcase the reconstruction capability of such model beyond the spatial resolution limit of the \u0000<inline-formula><tex-math>$0$</tex-math></inline-formula>\u0000th diffraction order image. Experimental results are shown both on synthetic data and on real datasets that we acquired using two different CTIS systems coupled with high spatial resolution ground truth hyper-spectral images. Furthermore, we introduce HSIRS, the largest dataset of its kind for joint spectral image reconstruction and semantic segmentation of food items with high quality manually annotated segmentation maps and we showcase how hyper-spectral data allows to efficiently tackle this task.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1449-1461"},"PeriodicalIF":4.2,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10640287","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Hybrid Polarization Image Demosaicking Algorithm Based on Inter-Channel Correlation","authors":"Yang Lu;Jiandong Tian;Yiming Su;Yidong Luo;Junchao Zhang;Chunhui Hao","doi":"10.1109/TCI.2024.3443728","DOIUrl":"10.1109/TCI.2024.3443728","url":null,"abstract":"Emerging \u0000<italic>monochrome and chromatic polarization filter array</i>\u0000 (MPFA and CPFA) cameras require polarization demosaicking to obtain accurate polarization parameters. Polarization cameras sample the polarization intensity at each location of the pixels. A captured raw image must be converted to a full-channel polarization intensity image using the \u0000<italic>polarization demosaicking method</i>\u0000 (PDM). However, due to sparse sampling between polarization channels, implementing MPFA and CPFA demosaicking has been challenging. This paper proposes a new hybrid polarization demosaicking algorithm that leverages polarization confidence-based refinement to exploit inter-channel polarization correlation. Additionally, we enhance texture correlation to utilize inter-channel texture correlation fully. Our three-stage PDM preserves both the polarization and texture information. We also introduce a metric computation method to handle the \u0000<inline-formula><tex-math>$pi$</tex-math></inline-formula>\u0000-ambiguity of the \u0000<italic>angle of line polarization</i>\u0000 (AoLP). This approach mitigates inaccuracies and \u0000<inline-formula><tex-math>$pi$</tex-math></inline-formula>\u0000-ambiguity in existing methods when describing the quality of AoLP reconstruction. We extensively compare and conduct ablation experiments on synthetic datasets from MPFA and CPFA. Our method achieves competitive results compared to other state-of-the-art methods. Furthermore, we evaluate our proposal on real-world datasets to demonstrate its applicability in real-world, variable scenarios. Two application experiments (road detection and shape from polarization) show that our proposal can be applied to real-world applications.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1400-1413"},"PeriodicalIF":4.2,"publicationDate":"2024-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decoupling Image Deblurring Into Twofold: A Hierarchical Model for Defocus Deblurring","authors":"Pengwei Liang;Junjun Jiang;Xianming Liu;Jiayi Ma","doi":"10.1109/TCI.2024.3443732","DOIUrl":"https://doi.org/10.1109/TCI.2024.3443732","url":null,"abstract":"Defocus deblurring, especially when facing spatially varying blur due to scene depth, remains a challenging problem. While recent advancements in network architectures have predominantly addressed high-frequency details, the importance of scene understanding for deblurring remains paramount. A crucial aspect of this understanding is \u0000<italic>contextual information</i>\u0000, which captures vital high-level semantic cues essential for grasping the context and object outlines. Recognizing and effectively capitalizing on these cues can lead to substantial improvements in image recovery. With this foundation, we propose a novel method that integrates spatial details and contextual information, offering significant advancements in defocus deblurring. Consequently, we introduce a novel hierarchical model, built upon the capabilities of the Vision Transformer (ViT). This model seamlessly encodes both spatial details and contextual information, yielding a robust solution. In particular, our approach decouples the complex deblurring task into two distinct subtasks. The first is handled by a primary feature encoder that transforms blurred images into detailed representations. The second involves a contextual encoder that produces abstract and sharp representations from the primary ones. The combined outputs from these encoders are then merged by a decoder to reproduce the sharp target image. Our evaluation across multiple defocus deblurring datasets demonstrates that the proposed method achieves compelling performance.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1207-1220"},"PeriodicalIF":4.2,"publicationDate":"2024-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142099854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Inverse-Problem Approach to the Estimation of Despeckled and Deconvolved Images From Radio-Frequency Signals in Pulse-Echo Ultrasound","authors":"Samuel Beuret;Adrien Besson;Akihiro Sugimoto;Jean-Philippe Thiran","doi":"10.1109/TCI.2024.3441234","DOIUrl":"https://doi.org/10.1109/TCI.2024.3441234","url":null,"abstract":"In recent years, there has been notable progress in the development of inverse problems for image reconstruction in pulse-echo ultrasound. Inverse problems are designed to circumvent the restrictions of delay-and-sum, such as limited image resolution and diffraction artifacts, especially when low amount of data are considered. However, the radio-frequency image or tissue reflectivity function that current inverse problems seek to estimate do not possess a structure that can be easily leveraged by a regularizer, in part due to their high dynamic range. The performance of inverse-problem image reconstruction is thus impeded. In contrast, despeckled images exhibit a more exploitable structure. Therefore, we first propose an inverse problem to recover a despeckled image from single-plane-wave radio-frequency echo signals, employing total-variation norm regularization. Then, we introduce an inverse problem to estimate the tissue reflectivity function from radio-frequency echo signals, factoring in the despeckled image obtained by the first problem into a spatially-varying reflectivity prior. We show with simulated, in-vitro, and in-vivo data that the proposed despeckled image estimation technique recovers images almost free of diffraction artifacts and improves contrast with respect to delay-and-sum and non-local means despeckling. Moreover, we show with in-vitro and in-vivo data that the proposed reflectivity estimation method reduces artifacts and improves contrast with respect to a state-of-the-art inverse problem positing a uniform prior. In particular, the proposed techniques could prove beneficial for imaging with ultra-portable transducers, since these devices are likely to be limited in the amount of data they can acquire and transmit.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1191-1206"},"PeriodicalIF":4.2,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10634308","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142099726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Piecewise Planar Representation for RGB Guided Depth Super-Resolution","authors":"Ruikang Xu;Mingde Yao;Yuanshen Guan;Zhiwei Xiong","doi":"10.1109/TCI.2024.3439990","DOIUrl":"10.1109/TCI.2024.3439990","url":null,"abstract":"RGB guided depth super-resolution (GDSR) aims to reconstruct high-resolution (HR) depth images from low-resolution ones using HR RGB images as guidance, overcoming the resolution limitation of depth cameras. The main challenge in this task is how to effectively explore the HR information from RGB images while avoiding texture being over-transferred. To address this challenge, we propose a novel method for GSDR based on the piecewise planar representation in the 3D space, which naturally focuses on the geometry information of scenes without concerning the internal textures. Specifically, we design a plane-aware interaction module to effectively bridge the RGB and depth modalities and perform information interaction by taking piecewise planes as the intermediary. We also devise a plane-guided fusion module to further remove modality-inconsistent information. To mitigate the distribution gap between synthetic and real-world data, we propose a self-training adaptation strategy for the real-world deployment of our method. Comprehensive experimental results on multiple representative datasets demonstrate the superiority of our method over existing state-of-the-art GDSR methods.","PeriodicalId":56022,"journal":{"name":"IEEE Transactions on Computational Imaging","volume":"10 ","pages":"1266-1279"},"PeriodicalIF":4.2,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141941880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}