{"title":"Color Control Functions for Multiprimary Displays I: Robustness Analysis and Optimization Formulations.","authors":"Carlos Eduardo Rodriguez-Pardo, Gaurav Sharma","doi":"10.1109/TIP.2019.2937067","DOIUrl":"10.1109/TIP.2019.2937067","url":null,"abstract":"<p><p>Color management for a multiprimary display requires, as a fundamental step, the determination of a color control function (CCF) that specifies control values for reproducing each color in the display's gamut. Multiprimary displays offer alternative choices of control values for reproducing a color in the interior of the gamut and accordingly alternative choices of CCFs. Under ideal conditions, alternative CCFs render colors identically. However, deviations in the spectral distributions of the primaries and the diversity of cone sensitivities among observers impact alternative CCFs differently, and, in particular, make some CCFs prone to artifacts in rendered images. We develop a framework for analyzing robustness of CCFs for multiprimary displays against primary and observer variations, incorporating a common model of human color perception. Using the framework, we propose analytical and numerical approaches for determining robust CCFs. First, via analytical development, we: (a) demonstrate that linearity of the CCF in tristimulus space endows it with resilience to variations, particularly, linearity can ensure invariance of the gray axis, (b) construct an axially linear CCF that is defined by the property of linearity over constant chromaticity loci, and (c) obtain an analytical form for the axially linear CCF that demonstrates it is continuous but suffers from the limitation that it does not have continuous derivatives. Second, to overcome the limitation of the axially linear CCF, we motivate and develop two variational objective functions for optimization of multiprimary CCFs, the first aims to preserve color transitions in the presence of primary/observer variations and the second combines this objective with desirable invariance along the gray axis, by incorporating the axially linear CCF. A companion Part II paper, presents an algorithmic approach for numerically computing optimal CCFs for the two alternative variational objective functions proposed here and presents results comparing alternative CCFs for several different 4,5, and 6 primary designs.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Thomas Amestoy, Alexandre Mercat, Wassim Hamidouche, Daniel Menard, Cyril Bergeron
{"title":"Tunable VVC Frame Partitioning based on Lightweight Machine Learning.","authors":"Thomas Amestoy, Alexandre Mercat, Wassim Hamidouche, Daniel Menard, Cyril Bergeron","doi":"10.1109/TIP.2019.2938670","DOIUrl":"10.1109/TIP.2019.2938670","url":null,"abstract":"<p><p>Block partition structure is a critical module in video coding scheme to achieve significant gap of compression performance. Under the exploration of the future video coding standard, named Versatile Video Coding (VVC), a new Quad Tree Binary Tree (QTBT) block partition structure has been introduced. In addition to the QT block partitioning defined in High Efficiency Video Coding (HEVC) standard, new horizontal and vertical BT partitions are enabled, which drastically increases the encoding time compared to HEVC. In this paper, we propose a lightweight and tunable QTBT partitioning scheme based on a Machine Learning (ML) approach. The proposed solution uses Random Forest classifiers to determine for each coding block the most probable partition modes. To minimize the encoding loss induced by misclassification, risk intervals for classifier decisions are introduced in the proposed solution. By varying the size of risk intervals, tunable trade-off between encoding complexity reduction and coding loss is achieved. The proposed solution implemented in the JEM-7.0 software offers encoding complexity reductions ranging from 30average for only 0.7% to 3.0% Bjxntegaard Delta Rate (BDBR) increase in Random Access (RA) coding configuration, with very slight overhead induced by Random Forest. The proposed solution based on Random Forest classifiers is also efficient to reduce the complexity of the Multi-Type Tree (MTT) partitioning scheme under the VTM-5.0 software, with complexity reductions ranging from 25% to 61% in average for only 0.4% to 2.2% BD-BR increase.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62586590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting Related and Unrelated Tasks for Hierarchical Metric Learning and Image Classification.","authors":"Yu Zheng, Jianping Fan, Ji Zhang, Xinbo Gao","doi":"10.1109/TIP.2019.2938321","DOIUrl":"10.1109/TIP.2019.2938321","url":null,"abstract":"<p><p>In multi-task learning, multiple interrelated tasks are jointly learned to achieve better performance. In many cases, if we can identify which tasks are related, we can also clearly identify which tasks are unrelated. In the past, most researchers emphasized exploiting correlations among interrelated tasks while completely ignoring the unrelated tasks that may provide valuable prior knowledge for multi-task learning. In this paper, a new approach is developed to hierarchically learn a tree of multi-task metrics by leveraging prior knowledge about both the related tasks and unrelated tasks. First, a visual tree is constructed to hierarchically organize large numbers of image categories in a coarse-to-fine fashion. Over the visual tree, a multi-task metric classifier is learned for each node by exploiting both the related and unrelated tasks, where the learning tasks for training the classifiers for the sibling child nodes under the same parent node are treated as the interrelated tasks, and the others are treated as the unrelated tasks. In addition, the node-specific metric for the parent node is propagated to its sibling child nodes to control inter-level error propagation. Our experimental results demonstrate that our hierarchical metric learning algorithm achieves better results than other state-of-the-art algorithms.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62586099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lei Zhang, Ji Liu, Bob Zhanga, David Zhangb, Ce Zhu
{"title":"Deep Cascade Model based Face Recognition: When Deep-layered Learning Meets Small Data.","authors":"Lei Zhang, Ji Liu, Bob Zhanga, David Zhangb, Ce Zhu","doi":"10.1109/TIP.2019.2938307","DOIUrl":"10.1109/TIP.2019.2938307","url":null,"abstract":"<p><p>Sparse representation based classification (SRC), nuclear-norm matrix regression (NMR), and deep learning (DL) have achieved a great success in face recognition (FR). However, there still exist some intrinsic limitations among them. SRC and NMR based coding methods belong to one-step model, such that the latent discriminative information of the coding error vector cannot be fully exploited. DL, as a multi-step model, can learn powerful representation, but relies on large-scale data and computation resources for numerous parameters training with complicated back-propagation. Straightforward training of deep neural networks from scratch on small-scale data is almost infeasible. Therefore, in order to develop efficient algorithms that are specifically adapted for small-scale data, we propose to derive the deep models of SRC and NMR. Specifically, in this paper, we propose an end-to-end deep cascade model (DCM) based on SRC and NMR with hierarchical learning, nonlinear transformation and multi-layer structure for corrupted face recognition. The contributions include four aspects. First, an end-to-end deep cascade model for small-scale data without back-propagation is proposed. Second, a multi-level pyramid structure is integrated for local feature representation. Third, for introducing nonlinear transformation in layer-wise learning, softmax vector coding of the errors with class discrimination is proposed. Fourth, the existing representation methods can be easily integrated into our DCM framework. Experiments on a number of small-scale benchmark FR datasets demonstrate the superiority of the proposed model over state-of-the-art counterparts. Additionally, a perspective that deep-layered learning does not have to be convolutional neural network with back-propagation optimization is consolidated. The demo code is available in https://github.com/liuji93/DCM.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62586446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple Cycle-in-Cycle Generative Adversarial Networks for Unsupervised Image Super-Resolution.","authors":"Yongbing Zhang, Siyuan Liu, Chao Dong, Xinfeng Zhang, Yuan Yuan","doi":"10.1109/TIP.2019.2938347","DOIUrl":"10.1109/TIP.2019.2938347","url":null,"abstract":"<p><p>With the help of convolutional neural networks (CNN), the single image super-resolution problem has been widely studied. Most of these CNN based methods focus on learning a model to map a low-resolution (LR) image to a highresolution (HR) image, where the LR image is downsampled from the HR image with a known model. However, in a more general case when the process of the down-sampling is unknown and the LR input is degraded by noises and blurring, it is difficult to acquire the LR and HR image pairs for traditional supervised learning. Inspired by the recent unsupervised imagestyle translation applications using unpaired data, we propose a multiple Cycle-in-Cycle network structure to deal with the more general case using multiple generative adversarial networks (GAN) as the basis components. The first network cycle aims at mapping the noisy and blurry LR input to a noise-free LR space, then a new cycle with a well-trained ×2 network model is orderly introduced to super-resolve the intermediate output of the former cycle. The number of total cycles depends on the different up-sampling factors (×2, ×4, ×8). Finally, all modules are trained in an end-to-end manner to get the desired HR output. Quantitative indexes and qualitative results show that our proposed method achieves comparable performance with the state-of-the-art supervised models.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62586945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jingwen Ye, Yongcheng Jing, Xinchao Wang, Kairi Ou, Dacheng Tao, Mingli Song
{"title":"Edge-Sensitive Human Cutout with Hierarchical Granularity and Loopy Matting Guidance.","authors":"Jingwen Ye, Yongcheng Jing, Xinchao Wang, Kairi Ou, Dacheng Tao, Mingli Song","doi":"10.1109/TIP.2019.2930146","DOIUrl":"10.1109/TIP.2019.2930146","url":null,"abstract":"<p><p>Human parsing and matting play important roles in various applications, such as dress collocation, clothing recommendation, and image editing. In this paper, we propose a lightweight hybrid model that unifies the fully-supervised hierarchical-granularity parsing task and the unsupervised matting one. Our model comprises two parts, the extensible hierarchical semantic segmentation block using CNN and the matting module composed of guided filters. Given a human image, the segmentation block stage-1 first obtains a primitive segmentation map to separate the human and the background. The primitive segmentation is then fed into stage-2 together with the original image to give a rough segmentation of human body. This procedure is repeated in the stage-3 to acquire a refined segmentation. The matting module takes as input the above estimated segmentation maps and produces the matting map, in a fully unsupervised manner. The obtained matting map is then in turn fed back to the CNN in the first block for refining the semantic segmentation results.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62583550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-View Video Synopsis via Simultaneous Object-Shifting and View-Switching Optimization.","authors":"Zhensong Zhang, Yongwei Nie, Hanqiu Sun, Qing Zhang, Qiuxia Lai, Guiqing Li, Mingyu Xiao","doi":"10.1109/TIP.2019.2938086","DOIUrl":"10.1109/TIP.2019.2938086","url":null,"abstract":"<p><p>We present a method for synopsizing multiple videos captured by a set of surveillance cameras with some overlapped field-of-views. Currently, object-based approaches that directly shift objects along the time axis are already able to compute compact synopsis results for multiple surveillance videos. The challenge is how to present the multiple synopsis results in a more compact and understandable way. Previous approaches show them side by side on the screen, which however is difficult for user to comprehend. In this paper, we solve the problem by joint object-shifting and camera view-switching. Firstly, we synchronize the input videos, and group the same object in different videos together. Then we shift the groups of objects along the time axis to obtain multiple synopsis videos. Instead of showing them simultaneously, we just show one of them at each time, and allow to switch among the views of different synopsis videos. In this view switching way, we obtain just a single synopsis results consisting of content from all the input videos, which is much easier for user to follow and understand. To obtain the best synopsis result, we construct a simultaneous object-shifting and view-switching optimization framework instead of solving them separately. We also present an alternative optimization strategy composed of graph cuts and dynamic programming to solve the unified optimization. Experiments demonstrate that our single synopsis video generated from multiple input videos is compact, complete, and easy to understand.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62586355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Seismic Image Interpolation with Mathematical Morphological Constraint.","authors":"Weilin Huang, Jianxin Liu","doi":"10.1109/TIP.2019.2936744","DOIUrl":"10.1109/TIP.2019.2936744","url":null,"abstract":"<p><p>Seismic image interpolation is a currently popular research subject in modern reflection seismology. The interpolation problem is generally treated as a process of inversion. Under the compressed sensing framework, various sparse transformations and low-rank constraints based methods have great performances in recovering irregularly missing traces. However, in the case of regularly missing traces, their applications are limited because of the strong spatial aliasing energies. In addition, the erratic noise always poses a serious impact on the interpolation results obtained by the sparse transformations and low-rank constraints-based methods. This is because the erratic noise is far from satisfying the statistical assumption behind these methods. In this study, we propose a mathematical morphology-based interpolation technique, which constrains the morphological scale of the model in the inversion process. The inversion problem is solved by the shaping regularization approach. The mathematical morphological constraint (MMC)-based interpolation technique has a satisfactory robustness to the spatial aliasing and erratic energies. We provide a detailed algorithmic framework and discuss the extension from 2D to higher dimensional version and the back operator in the shaping inversion. A group of numerical examples demonstrates the successful performance of the proposed technique.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Maria Perez-Ortiz, Aliaksei Mikhailiuk, Emin Zerman, Vedad Hulusic, Giuseppe Valenzise, Rafal K Mantiuk
{"title":"From pairwise comparisons and rating to a unified quality scale.","authors":"Maria Perez-Ortiz, Aliaksei Mikhailiuk, Emin Zerman, Vedad Hulusic, Giuseppe Valenzise, Rafal K Mantiuk","doi":"10.1109/TIP.2019.2936103","DOIUrl":"10.1109/TIP.2019.2936103","url":null,"abstract":"<p><p>The goal of psychometric scaling is the quantification of perceptual experiences, understanding the relationship between an external stimulus, the internal representation and the response. In this paper, we propose a probabilistic framework to fuse the outcome of different psychophysical experimental protocols, namely rating and pairwise comparisons experiments. Such a method can be used for merging existing datasets of subjective nature and for experiments in which both measurements are collected. We analyze and compare the outcomes of both types of experimental protocols in terms of time and accuracy in a set of simulations and experiments with benchmark and real-world image quality assessment datasets, showing the necessity of scaling and the advantages of each protocol and mixing. Although most of our examples focus on image quality assessment, our findings generalize to any other subjective quality-of-experience task.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Efficient Algorithm for the Piecewise Affine-Linear Mumford-Shah Model Based on a Taylor Jet Splitting.","authors":"Lukas Kiefer, Martin Storath, Andreas Weinmann","doi":"10.1109/TIP.2019.2937040","DOIUrl":"10.1109/TIP.2019.2937040","url":null,"abstract":"<p><p>We propose an algorithm to efficiently compute approximate solutions of the piecewise affine Mumford-Shah model. The algorithm is based on a novel reformulation of the underlying optimization problem in terms of Taylor jets. A splitting approach leads to linewise segmented jet estimation problems for which we propose an exact and efficient solver. The proposed method has the combined advantages of prior algorithms: it directly yields a partition, it does not need an initialization procedure, and it is highly parallelizable. The experiments show that the algorithm has lower computation times and that the solutions often have lower functional values than the state-of-the-art.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}