{"title":"Parallax Tolerant Light Field Stitching for Hand-held Plenoptic Cameras.","authors":"Xin Jin, Pei Wang, Qionghai Dai","doi":"10.1109/TIP.2019.2945687","DOIUrl":"10.1109/TIP.2019.2945687","url":null,"abstract":"<p><p>Light field (LF) stitching is a potential solution to improve the field of view (FOV) for hand-held plenoptic cameras. Existing LF stitching methods cannot provide accurate registration for scenes with large depth variation. In this paper, a novel LF stitching method is proposed to handle parallax in the LFs more flexibly and accurately. First, a depth layer map (DLM) is proposed to guarantee adequate feature points on each depth layer. For the regions of nondeterministic depth, superpixel layer map (SLM) is proposed based on LF spatial correlation analysis to refine the depth layer assignments. Then, DLM-SLM-based LF registration is proposed to derive the location dependent homography transforms accurately and to warp LFs to its corresponding position without parallax interference. 4D graph-cut is further applied to fuse the registration results for higher LF spatial continuity and angular continuity. Horizontal, vertical and multi-LF stitching are tested for different scenes, which demonstrates the superior performance provided by the proposed method in terms of subjective quality of the stitched LFs, epipolar plane image consistency in the stitched LF, and perspective-averaged correlation between the stitched LF and the input LFs.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62590450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alessandro Artusi, Francesco Banterle, Fabio Carrara, Alejandro Moreo
{"title":"Efficient Evaluation of Image Quality via Deep-Learning Approximation of Perceptual Metrics.","authors":"Alessandro Artusi, Francesco Banterle, Fabio Carrara, Alejandro Moreo","doi":"10.1109/TIP.2019.2944079","DOIUrl":"10.1109/TIP.2019.2944079","url":null,"abstract":"<p><p>Image metrics based on Human Visual System (HVS) play a remarkable role in the evaluation of complex image processing algorithms. However, mimicking the HVS is known to be complex and computationally expensive (both in terms of time and memory), and its usage is thus limited to a few applications and to small input data. All of this makes such metrics not fully attractive in real-world scenarios. To address these issues, we propose Deep Image Quality Metric (DIQM), a deep-learning approach to learn the global image quality feature (mean-opinion-score). DIQM can emulate existing visual metrics efficiently, reducing the computational costs by more than an order of magnitude with respect to existing implementations.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62589666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting Block-sparsity for Hyperspectral Kronecker Compressive Sensing: a Tensor-based Bayesian Method.","authors":"Rongqiang Zhao, Qiang Wang, Jun Fu, Luquan Ren","doi":"10.1109/TIP.2019.2944722","DOIUrl":"10.1109/TIP.2019.2944722","url":null,"abstract":"<p><p>Bayesian methods are attracting increasing attention in the field of compressive sensing (CS), as they are applicable to recover signals from random measurements. However, these methods have limited use in many tensor-based cases such as hyperspectral Kronecker compressive sensing (HKCS), because they exploit the sparsity in only one dimension. In this paper, we propose a novel Bayesian model for HKCS in an attempt to overcome the above limitation. The model exploits multi-dimensional block-sparsity such that the information redundancies in all dimensions are eliminated. Laplace prior distributions are employed for sparse coefficients in each dimension, and their coupling is consistent with the multi-dimensional block-sparsity model. Based on the proposed model, we develop a tensor-based Bayesian reconstruction algorithm, which decouples the hyperparameters for each dimension via a low-complexity technique. Experimental results demonstrate that the proposed method is able to provide more accurate reconstruction than existing Bayesian methods at a satisfactory speed. Additionally, the proposed method can not only be used for HKCS, it also has the potential to be extended to other multi-dimensional CS applications and to multi-dimensional block-sparse-based data recovery.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62590166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiuxiu Bai, Lele Ye, Jihua Zhu, Li Zhu, Taku Komura
{"title":"Skeleton Filter: A Self-Symmetric Filter for Skeletonization in Noisy Text Images.","authors":"Xiuxiu Bai, Lele Ye, Jihua Zhu, Li Zhu, Taku Komura","doi":"10.1109/TIP.2019.2944560","DOIUrl":"10.1109/TIP.2019.2944560","url":null,"abstract":"<p><p>Robustly computing the skeletons of objects in natural images is difficult due to the large variations in shape boundaries and the large amount of noise in the images. Inspired by recent findings in neuroscience, we propose the Skeleton Filter, which is a novel model for skeleton extraction from natural images. The Skeleton Filter consists of a pair of oppositely oriented Gabor-like filters; by applying the Skeleton Filter in various orientations to an image at multiple resolutions and fusing the results, our system can robustly extract the skeleton even under highly noisy conditions. We evaluate the performance of our approach using challenging noisy text datasets and demonstrate that our pipeline realizes state-of-the-art performance for extracting the text skeleton. Moreover, the presence of Gabor filters in the human visual system and the simple architecture of the Skeleton Filter can help explain the strong capabilities of humans in perceiving skeletons of objects, even under dramatically noisy conditions.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62589989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Coupled ISTA Network for Multi-modal Image Super-Resolution.","authors":"Xin Deng, Pier Luigi Dragotti","doi":"10.1109/TIP.2019.2944270","DOIUrl":"10.1109/TIP.2019.2944270","url":null,"abstract":"<p><p>Given a low-resolution (LR) image, multi-modal image super-resolution (MISR) aims to find the high-resolution (HR) version of this image with the guidance of an HR image from another modality. In this paper, we use a model-based approach to design a new deep network architecture for MISR. We first introduce a novel joint multi-modal dictionary learning (JMDL) algorithm to model cross-modality dependency. In JMDL, we simultaneously learn three dictionaries and two transform matrices to combine the modalities. Then, by unfolding the iterative shrinkage and thresholding algorithm (ISTA), we turn the JMDL model into a deep neural network, called deep coupled ISTA network. Since the network initialization plays an important role in deep network training, we further propose a layer-wise optimization algorithm (LOA) to initialize the parameters of the network before running back-propagation strategy. Specifically, we model the network initialization as a multi-layer dictionary learning problem, and solve it through convex optimization. The proposed LOA is demonstrated to effectively decrease the training loss and increase the reconstruction accuracy. Finally, we compare our method with other state-of-the-art methods in the MISR task. The numerical results show that our method consistently outperforms others both quantitatively and qualitatively at different upscaling factors for various multi-modal scenarios.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62589478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Si Wu, Wenhao Wu, Shiyao Lei, Sihao Lin, Rui Li, Zhiwen Yu, Hau-San Wong
{"title":"Semi-Supervised Human Detection via Region Proposal Networks Aided by Verification.","authors":"Si Wu, Wenhao Wu, Shiyao Lei, Sihao Lin, Rui Li, Zhiwen Yu, Hau-San Wong","doi":"10.1109/TIP.2019.2944306","DOIUrl":"10.1109/TIP.2019.2944306","url":null,"abstract":"<p><p>In this paper, we explore how to leverage readily available unlabeled data to improve semi-supervised human detection performance. For this purpose, we specifically modify the region proposal network (RPN) for learning on a partially labeled dataset. Based on commonly observed false positive types, a verification module is developed to assess foreground human objects in the candidate regions to provide an important cue for filtering the RPN's proposals. The remaining proposals with high confidence scores are then used as pseudo annotations for re-training our detection model. To reduce the risk of error propagation in the training process, we adopt a self-paced training strategy to progressively include more pseudo annotations generated by the previous model over multiple training rounds. The resulting detector re-trained on the augmented data can be expected to have better detection performance. The effectiveness of the main components of this framework is verified through extensive experiments, and the proposed approach achieves state-of-the-art detection results on multiple scene-specific human detection benchmarks in the semi-supervised setting.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62589764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Interleaved Cascade of Shrinkage Fields for Joint Image Dehazing and Denoising.","authors":"Qingbo Wu, Wenqi Ren, Xiaochun Cao","doi":"10.1109/TIP.2019.2942504","DOIUrl":"10.1109/TIP.2019.2942504","url":null,"abstract":"<p><p>Most existing image dehazing methods deteriorate to different extents when processing hazy inputs with noise. The main reason is that the commonly adopted two-step strategy tends to amplify noise in the inverse operation of division by the transmission. To address this problem, we learn an interleaved Cascade of Shrinkage Fields (CSF) to reduce noise in jointly recovering the transmission map and the scene radiance from a single hazy image. Specifically, an auxiliary shrinkage field (SF) model is integrated into each cascade of the proposed scheme to reduce undesirable artifacts during the transmission estimation. Different from conventional CSF, our learned SF models have special visual patterns, which facilitate the specific task of noise reduction in haze removal. Furthermore, a numerical algorithm is proposed to efficiently update the scene radiance and the transmission map in each cascade. Extensive experiments on synthetic and real-world data demonstrate that the proposed algorithm performs favorably against state-of-the-art dehazing methods on hazy and noisy images.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62588486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Biological Vision Inspired Framework for Image Enhancement in Poor Visibility Conditions.","authors":"Kai-Fu Yang, Xian-Shi Zhang, Yong-Jie Li","doi":"10.1109/TIP.2019.2938310","DOIUrl":"10.1109/TIP.2019.2938310","url":null,"abstract":"<p><p>Image enhancement is an important pre-processing step for many computer vision applications especially regarding the scenes in poor visibility conditions. In this work, we develop a unified two-pathway model inspired by the biological vision, especially the early visual mechanisms, which contributes to image enhancement tasks including low dynamic range (LDR) image enhancement and high dynamic range (HDR) image tone mapping. Firstly, the input image is separated and sent into two visual pathways: structure-pathway and detail-pathway, corresponding to the M-and P-pathway in the early visual system, which code the low-and high-frequency visual information, respectively. In the structure-pathway, an extended biological normalization model is used to integrate the global and local luminance adaptation, which can handle the visual scenes with varying illuminations. On the other hand, the detail enhancement and local noise suppression are achieved in the detail-pathway based on local energy weighting. Finally, the outputs of structure-and detail-pathway are integrated to achieve the low-light image enhancement. In addition, the proposed model can also be used for tone mapping of HDR images with some fine-tuning steps. Extensive experiments on three datasets (two LDR image datasets and one HDR scene dataset) show that the proposed model can handle the visual enhancement tasks mentioned above efficiently and outperform the related state-of-the-art methods.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62586368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Receptive Field Size vs. Model Depth for Single Image Super-resolution.","authors":"Ruxin Wang, Mingming Gong, Dacheng Tao","doi":"10.1109/TIP.2019.2941327","DOIUrl":"10.1109/TIP.2019.2941327","url":null,"abstract":"<p><p>The performance of single image super-resolution (SISR) has been largely improved by innovative designs of deep architectures. An important claim raised by these designs is that the deep models have large receptive field size and strong nonlinearity. However, we are concerned about the question that which factor, receptive field size or model depth, is more critical for SISR. Towards revealing the answers, in this paper, we propose a strategy based on dilated convolution to investigate how the two factors affect the performance of SISR. Our findings from exhaustive investigations suggest that SISR is more sensitive to the changes of receptive field size than to the model depth variations, and that the model depth must be congruent with the receptive field size to produce improved performance. These findings inspire us to design a shallower architecture which can save computational and memory cost while preserving comparable effectiveness with respect to a much deeper architecture.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62587513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhuo Chen, Kui Fan, Shiqi Wang, Lingyu Duan, Weisi Lin, Alex C Kot
{"title":"Intermediate Deep Feature Compression: Toward Intelligent Sensing.","authors":"Zhuo Chen, Kui Fan, Shiqi Wang, Lingyu Duan, Weisi Lin, Alex C Kot","doi":"10.1109/TIP.2019.2941660","DOIUrl":"10.1109/TIP.2019.2941660","url":null,"abstract":"<p><p>The recent advances of hardware technology have made the intelligent analysis equipped at the front-end with deep learning more prevailing and practical. To better enable the intelligent sensing at the front-end, instead of compressing and transmitting visual signals or the ultimately utilized top-layer deep learning features, we propose to compactly represent and convey the intermediate-layer deep learning features with high generalization capability, to facilitate the collaborating approach between front and cloud ends. This strategy enables a good balance among the computational load, transmission load and the generalization ability for cloud servers when deploying the deep neural networks for large scale cloud based visual analysis. Moreover, the presented strategy also makes the standardization of deep feature coding more feasible and promising, as a series of tasks can simultaneously benefit from the transmitted intermediate layer features. We also present the results for evaluations of both lossless and lossy deep feature compression, which provide meaningful investigations and baselines for future research and standardization activities.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62587991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}