{"title":"Multi-View Video Synopsis via Simultaneous Object-Shifting and View-Switching Optimization.","authors":"Zhensong Zhang, Yongwei Nie, Hanqiu Sun, Qing Zhang, Qiuxia Lai, Guiqing Li, Mingyu Xiao","doi":"10.1109/TIP.2019.2938086","DOIUrl":"10.1109/TIP.2019.2938086","url":null,"abstract":"<p><p>We present a method for synopsizing multiple videos captured by a set of surveillance cameras with some overlapped field-of-views. Currently, object-based approaches that directly shift objects along the time axis are already able to compute compact synopsis results for multiple surveillance videos. The challenge is how to present the multiple synopsis results in a more compact and understandable way. Previous approaches show them side by side on the screen, which however is difficult for user to comprehend. In this paper, we solve the problem by joint object-shifting and camera view-switching. Firstly, we synchronize the input videos, and group the same object in different videos together. Then we shift the groups of objects along the time axis to obtain multiple synopsis videos. Instead of showing them simultaneously, we just show one of them at each time, and allow to switch among the views of different synopsis videos. In this view switching way, we obtain just a single synopsis results consisting of content from all the input videos, which is much easier for user to follow and understand. To obtain the best synopsis result, we construct a simultaneous object-shifting and view-switching optimization framework instead of solving them separately. We also present an alternative optimization strategy composed of graph cuts and dynamic programming to solve the unified optimization. Experiments demonstrate that our single synopsis video generated from multiple input videos is compact, complete, and easy to understand.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62586355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Seismic Image Interpolation with Mathematical Morphological Constraint.","authors":"Weilin Huang, Jianxin Liu","doi":"10.1109/TIP.2019.2936744","DOIUrl":"10.1109/TIP.2019.2936744","url":null,"abstract":"<p><p>Seismic image interpolation is a currently popular research subject in modern reflection seismology. The interpolation problem is generally treated as a process of inversion. Under the compressed sensing framework, various sparse transformations and low-rank constraints based methods have great performances in recovering irregularly missing traces. However, in the case of regularly missing traces, their applications are limited because of the strong spatial aliasing energies. In addition, the erratic noise always poses a serious impact on the interpolation results obtained by the sparse transformations and low-rank constraints-based methods. This is because the erratic noise is far from satisfying the statistical assumption behind these methods. In this study, we propose a mathematical morphology-based interpolation technique, which constrains the morphological scale of the model in the inversion process. The inversion problem is solved by the shaping regularization approach. The mathematical morphological constraint (MMC)-based interpolation technique has a satisfactory robustness to the spatial aliasing and erratic energies. We provide a detailed algorithmic framework and discuss the extension from 2D to higher dimensional version and the back operator in the shaping inversion. A group of numerical examples demonstrates the successful performance of the proposed technique.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Maria Perez-Ortiz, Aliaksei Mikhailiuk, Emin Zerman, Vedad Hulusic, Giuseppe Valenzise, Rafal K Mantiuk
{"title":"From pairwise comparisons and rating to a unified quality scale.","authors":"Maria Perez-Ortiz, Aliaksei Mikhailiuk, Emin Zerman, Vedad Hulusic, Giuseppe Valenzise, Rafal K Mantiuk","doi":"10.1109/TIP.2019.2936103","DOIUrl":"10.1109/TIP.2019.2936103","url":null,"abstract":"<p><p>The goal of psychometric scaling is the quantification of perceptual experiences, understanding the relationship between an external stimulus, the internal representation and the response. In this paper, we propose a probabilistic framework to fuse the outcome of different psychophysical experimental protocols, namely rating and pairwise comparisons experiments. Such a method can be used for merging existing datasets of subjective nature and for experiments in which both measurements are collected. We analyze and compare the outcomes of both types of experimental protocols in terms of time and accuracy in a set of simulations and experiments with benchmark and real-world image quality assessment datasets, showing the necessity of scaling and the advantages of each protocol and mixing. Although most of our examples focus on image quality assessment, our findings generalize to any other subjective quality-of-experience task.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Efficient Algorithm for the Piecewise Affine-Linear Mumford-Shah Model Based on a Taylor Jet Splitting.","authors":"Lukas Kiefer, Martin Storath, Andreas Weinmann","doi":"10.1109/TIP.2019.2937040","DOIUrl":"10.1109/TIP.2019.2937040","url":null,"abstract":"<p><p>We propose an algorithm to efficiently compute approximate solutions of the piecewise affine Mumford-Shah model. The algorithm is based on a novel reformulation of the underlying optimization problem in terms of Taylor jets. A splitting approach leads to linewise segmented jet estimation problems for which we propose an exact and efficient solver. The proposed method has the combined advantages of prior algorithms: it directly yields a partition, it does not need an initialization procedure, and it is highly parallelizable. The experiments show that the algorithm has lower computation times and that the solutions often have lower functional values than the state-of-the-art.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xianjing Han, Xuemeng Song, Yiyang Yao, Xin-Shun Xu, Liqiang Nie
{"title":"Neural Compatibility Modeling with Probabilistic Knowledge Distillation.","authors":"Xianjing Han, Xuemeng Song, Yiyang Yao, Xin-Shun Xu, Liqiang Nie","doi":"10.1109/TIP.2019.2936742","DOIUrl":"10.1109/TIP.2019.2936742","url":null,"abstract":"<p><p>In modern society, clothing matching plays a pivotal role in people's daily life, as suitable outfits can beautify their appearance directly. Nevertheless, how to make a suitable outfit has become a daily headache for many people, especially those who do not have much sense of aesthetics. In the light of this, many research efforts have been dedicated to the task of complementary clothing matching and have achieved great success relying on the advanced data-driven neural networks. However, most existing methods overlook the rich valuable knowledge accumulated by our human beings in the fashion domain, especially the rules regarding clothing matching, like \"coats go with dresses\" and \"silk tops cannot go with chiffon bottoms\". Towards this end, in this work, we propose a knowledge-guided neural compatibility modeling scheme, which is able to incorporate the rich fashion domain knowledge to enhance the performance of the compatibility modeling in the context of clothing matching. To better integrate the huge and implicit fashion domain knowledge into the data-driven neural networks, we present a probabilistic knowledge distillation (PKD) method, which is able to encode vast knowledge rules in a probabilistic manner. Extensive experiments on two real-world datasets have verified the guidance of rules from different sources and demonstrated the effectiveness and portability of our model. As a byproduct, we released the codes and involved parameters to benefit the research community.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shuai Gu, Junhui Hou, Huanqiang Zeng, Hui Yuan, Kai-Kuang Ma
{"title":"3D Point Cloud Attribute Compression Using Geometry-Guided Sparse Representation.","authors":"Shuai Gu, Junhui Hou, Huanqiang Zeng, Hui Yuan, Kai-Kuang Ma","doi":"10.1109/TIP.2019.2936738","DOIUrl":"10.1109/TIP.2019.2936738","url":null,"abstract":"<p><p>3D point clouds associated with attributes are considered as a promising paradigm for immersive communication. However, the corresponding compression schemes for this media are still in the infant stage. Moreover, in contrast to conventional image/video compression, it is a more challenging task to compress 3D point cloud data, arising from the irregular structure. In this paper, we propose a novel and effective compression scheme for the attributes of voxelized 3D point clouds. In the first stage, an input voxelized 3D point cloud is divided into blocks of equal size. Then, to deal with the irregular structure of 3D point clouds, a geometry-guided sparse representation (GSR) is proposed to eliminate the redundancy within each block, which is formulated as an ℓ0-norm regularized optimization problem. Also, an inter-block prediction scheme is applied to remove the redundancy between blocks. Finally, by quantitatively analyzing the characteristics of the resulting transform coefficients by GSR, an effective entropy coding strategy that is tailored to our GSR is developed to generate the bitstream. Experimental results over various benchmark datasets show that the proposed compression scheme is able to achieve better rate-distortion performance and visual quality, compared with state-of-the-art methods.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video Saliency Prediction using Spatiotemporal Residual Attentive Networks.","authors":"Qiuxia Lai, Wenguan Wang, Hanqiu Sun, Jianbing Shen","doi":"10.1109/TIP.2019.2936112","DOIUrl":"10.1109/TIP.2019.2936112","url":null,"abstract":"<p><p>This paper proposes a novel residual attentive learning network architecture for predicting dynamic eye-fixation maps. The proposed model emphasizes two essential issues, i.e, effective spatiotemporal feature integration and multi-scale saliency learning. For the first problem, appearance and motion streams are tightly coupled via dense residual cross connections, which integrate appearance information with multi-layer, comprehensive motion features in a residual and dense way. Beyond traditional two-stream models learning appearance and motion features separately, such design allows early, multi-path information exchange between different domains, leading to a unified and powerful spatiotemporal learning architecture. For the second one, we propose a composite attention mechanism that learns multi-scale local attentions and global attention priors end-to-end. It is used for enhancing the fused spatiotemporal features via emphasizing important features in multi-scales. A lightweight convolutional Gated Recurrent Unit (convGRU), which is flexible for small training data situation, is used for long-term temporal characteristics modeling. Extensive experiments over four benchmark datasets clearly demonstrate the advantage of the proposed video saliency model over other competitors and the effectiveness of each component of our network. Our code and all the results will be available at https://github.com/ashleylqx/STRA-Net.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yunhang Shen, Rongrong Ji, Kuiyuan Yang, Cheng Deng, Changhu Wang
{"title":"Category-Aware Spatial Constraint for Weakly Supervised Detection.","authors":"Yunhang Shen, Rongrong Ji, Kuiyuan Yang, Cheng Deng, Changhu Wang","doi":"10.1109/TIP.2019.2933735","DOIUrl":"10.1109/TIP.2019.2933735","url":null,"abstract":"<p><p>Weakly supervised object detection has attracted increasing research attention recently. To this end, most existing schemes rely on scoring category-independent region proposals, which is formulated as a multiple instance learning problem. During this process, the proposal scores are aggregated and supervised by only image-level labels, which often fails to locate object boundaries precisely. In this paper, we break through such a restriction by taking a deeper look into the score aggregation stage and propose a Category-aware Spatial Constraint (CSC) scheme for proposals, which is integrated into weakly supervised object detection in an end-to-end learning manner. In particular, we incorporate the global shape information of objects as an unsupervised constraint, which is inferred from build-in foreground-and-background cues, termed Category-specific Pixel Gradient (CPG) maps. Specifically, each region proposal is weighted according to how well it covers the estimated shape of objects. For each category, a multi-center regularization is further introduced to penalize the violations between centers cluster and high-score proposals in a given image. Extensive experiments are done on the most widely-used benchmark Pascal VOC and COCO, which shows that our approach significantly improves weakly supervised object detection without adding new learnable parameters to the existing models nor changing the structures of CNNs.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62584864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Key-point Detector based on Sparse Coding.","authors":"Thanh Hong-Phuoc, Ling Guan","doi":"10.1109/TIP.2019.2934891","DOIUrl":"10.1109/TIP.2019.2934891","url":null,"abstract":"<p><p>Most popular hand-crafted key-point detectors such as Harris corner, MSER, SIFT, SURF rely on some specific pre-designed structures for detection of corners, blobs, or junctions in an image. The very nature of pre-designed structures can be considered a source of inflexibility for these detectors in different contexts. Additionally, the performance of these detectors is also highly affected by non-uniform change in illumination. To the best of our knowledge, while there are some previous works addressing one of the two aforementioned problems, there currently lacks an efficient method to solve both simultaneously. In this paper, we propose a novel Sparse Coding based Key-point detector (SCK) which is fully invariant to affine intensity change and independent of any particular structure. The proposed detector locates a key-point in an image, based on a complexity measure calculated from the block surrounding its position. A strength measure is also proposed for comparing and selecting the detected key-points when the maximum number of key-points is limited. In this paper, the desirable characteristics of the proposed detector are theoretically confirmed. Experimental results on three public datasets also show that the proposed detector achieves significantly high performance in terms of repeatability and matching score.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jichang Li, Si Wu, Cheng Liu, Zhiwen Yu, Hau-San Wong
{"title":"Semi-Supervised Deep Coupled Ensemble Learning with Classification Landmark Exploration.","authors":"Jichang Li, Si Wu, Cheng Liu, Zhiwen Yu, Hau-San Wong","doi":"10.1109/TIP.2019.2933724","DOIUrl":"10.1109/TIP.2019.2933724","url":null,"abstract":"<p><p>Using an ensemble of neural networks with consistency regularization is effective for improving performance and stability of deep learning, compared to the case of a single network. In this paper, we present a semi-supervised Deep Coupled Ensemble (DCE) model, which contributes to ensemble learning and classification landmark exploration for better locating the final decision boundaries in the learnt latent space. First, multiple complementary consistency regularizations are integrated into our DCE model to enable the ensemble members to learn from each other and themselves, such that training experience from different sources can be shared and utilized during training. Second, in view of the possibility of producing incorrect predictions on a number of difficult instances, we adopt class-wise mean feature matching to explore important unlabeled instances as classification landmarks, on which the model predictions are more reliable. Minimizing the weighted conditional entropy on unlabeled data is able to force the final decision boundaries to move away from important training data points, which facilitates semi-supervised learning. Ensemble members could eventually have similar performance due to consistency regularization, and thus only one of these members is needed during the test stage, such that the efficiency of our model is the same as the non-ensemble case. Extensive experimental results demonstrate the superiority of our proposed DCE model over existing state-of-the-art semi-supervised learning methods.</p>","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62584761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}