IEEE Transactions on Image Processing最新文献_第9页

Neural Compatibility Modeling with Probabilistic Knowledge Distillation. 利用概率知识蒸馏建立神经兼容性模型。

IF 10.6 1区计算机科学

IEEE Transactions on Image Processing Pub Date : 2019-08-27 DOI: 10.1109/TIP.2019.2936742

Xianjing Han, Xuemeng Song, Yiyang Yao, Xin-Shun Xu, Liqiang Nie

{"title":"Neural Compatibility Modeling with Probabilistic Knowledge Distillation.","authors":"Xianjing Han, Xuemeng Song, Yiyang Yao, Xin-Shun Xu, Liqiang Nie","doi":"10.1109/TIP.2019.2936742","DOIUrl":"10.1109/TIP.2019.2936742","url":null,"abstract":"In modern society, clothing matching plays a pivotal role in people's daily life, as suitable outfits can beautify their appearance directly. Nevertheless, how to make a suitable outfit has become a daily headache for many people, especially those who do not have much sense of aesthetics. In the light of this, many research efforts have been dedicated to the task of complementary clothing matching and have achieved great success relying on the advanced data-driven neural networks. However, most existing methods overlook the rich valuable knowledge accumulated by our human beings in the fashion domain, especially the rules regarding clothing matching, like \"coats go with dresses\" and \"silk tops cannot go with chiffon bottoms\". Towards this end, in this work, we propose a knowledge-guided neural compatibility modeling scheme, which is able to incorporate the rich fashion domain knowledge to enhance the performance of the compatibility modeling in the context of clothing matching. To better integrate the huge and implicit fashion domain knowledge into the data-driven neural networks, we present a probabilistic knowledge distillation (PKD) method, which is able to encode vast knowledge rules in a probabilistic manner. Extensive experiments on two real-world datasets have verified the guidance of rules from different sources and demonstrated the effectiveness and portability of our model. As a byproduct, we released the codes and involved parameters to benefit the research community.","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

3D Point Cloud Attribute Compression Using Geometry-Guided Sparse Representation. 利用几何引导的稀疏表示压缩三维点云属性

IF 10.6 1区计算机科学

IEEE Transactions on Image Processing Pub Date : 2019-08-27 DOI: 10.1109/TIP.2019.2936738

Shuai Gu, Junhui Hou, Huanqiang Zeng, Hui Yuan, Kai-Kuang Ma

{"title":"3D Point Cloud Attribute Compression Using Geometry-Guided Sparse Representation.","authors":"Shuai Gu, Junhui Hou, Huanqiang Zeng, Hui Yuan, Kai-Kuang Ma","doi":"10.1109/TIP.2019.2936738","DOIUrl":"10.1109/TIP.2019.2936738","url":null,"abstract":"3D point clouds associated with attributes are considered as a promising paradigm for immersive communication. However, the corresponding compression schemes for this media are still in the infant stage. Moreover, in contrast to conventional image/video compression, it is a more challenging task to compress 3D point cloud data, arising from the irregular structure. In this paper, we propose a novel and effective compression scheme for the attributes of voxelized 3D point clouds. In the first stage, an input voxelized 3D point cloud is divided into blocks of equal size. Then, to deal with the irregular structure of 3D point clouds, a geometry-guided sparse representation (GSR) is proposed to eliminate the redundancy within each block, which is formulated as an ℓ0-norm regularized optimization problem. Also, an inter-block prediction scheme is applied to remove the redundancy between blocks. Finally, by quantitatively analyzing the characteristics of the resulting transform coefficients by GSR, an effective entropy coding strategy that is tailored to our GSR is developed to generate the bitstream. Experimental results over various benchmark datasets show that the proposed compression scheme is able to achieve better rate-distortion performance and visual quality, compared with state-of-the-art methods.","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Video Saliency Prediction using Spatiotemporal Residual Attentive Networks. 利用时空残留注意力网络进行视频显著性预测

IF 10.6 1区计算机科学

IEEE Transactions on Image Processing Pub Date : 2019-08-23 DOI: 10.1109/TIP.2019.2936112

Qiuxia Lai, Wenguan Wang, Hanqiu Sun, Jianbing Shen

{"title":"Video Saliency Prediction using Spatiotemporal Residual Attentive Networks.","authors":"Qiuxia Lai, Wenguan Wang, Hanqiu Sun, Jianbing Shen","doi":"10.1109/TIP.2019.2936112","DOIUrl":"10.1109/TIP.2019.2936112","url":null,"abstract":"This paper proposes a novel residual attentive learning network architecture for predicting dynamic eye-fixation maps. The proposed model emphasizes two essential issues, i.e, effective spatiotemporal feature integration and multi-scale saliency learning. For the first problem, appearance and motion streams are tightly coupled via dense residual cross connections, which integrate appearance information with multi-layer, comprehensive motion features in a residual and dense way. Beyond traditional two-stream models learning appearance and motion features separately, such design allows early, multi-path information exchange between different domains, leading to a unified and powerful spatiotemporal learning architecture. For the second one, we propose a composite attention mechanism that learns multi-scale local attentions and global attention priors end-to-end. It is used for enhancing the fused spatiotemporal features via emphasizing important features in multi-scales. A lightweight convolutional Gated Recurrent Unit (convGRU), which is flexible for small training data situation, is used for long-term temporal characteristics modeling. Extensive experiments over four benchmark datasets clearly demonstrate the advantage of the proposed video saliency model over other competitors and the effectiveness of each component of our network. Our code and all the results will be available at https://github.com/ashleylqx/STRA-Net.","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Category-Aware Spatial Constraint for Weakly Supervised Detection. 弱监督检测的类别感知空间约束

IF 10.6 1区计算机科学

IEEE Transactions on Image Processing Pub Date : 2019-08-22 DOI: 10.1109/TIP.2019.2933735

Yunhang Shen, Rongrong Ji, Kuiyuan Yang, Cheng Deng, Changhu Wang

{"title":"Category-Aware Spatial Constraint for Weakly Supervised Detection.","authors":"Yunhang Shen, Rongrong Ji, Kuiyuan Yang, Cheng Deng, Changhu Wang","doi":"10.1109/TIP.2019.2933735","DOIUrl":"10.1109/TIP.2019.2933735","url":null,"abstract":"Weakly supervised object detection has attracted increasing research attention recently. To this end, most existing schemes rely on scoring category-independent region proposals, which is formulated as a multiple instance learning problem. During this process, the proposal scores are aggregated and supervised by only image-level labels, which often fails to locate object boundaries precisely. In this paper, we break through such a restriction by taking a deeper look into the score aggregation stage and propose a Category-aware Spatial Constraint (CSC) scheme for proposals, which is integrated into weakly supervised object detection in an end-to-end learning manner. In particular, we incorporate the global shape information of objects as an unsupervised constraint, which is inferred from build-in foreground-and-background cues, termed Category-specific Pixel Gradient (CPG) maps. Specifically, each region proposal is weighted according to how well it covers the estimated shape of objects. For each category, a multi-center regularization is further introduced to penalize the violations between centers cluster and high-score proposals in a given image. Extensive experiments are done on the most widely-used benchmark Pascal VOC and COCO, which shows that our approach significantly improves weakly supervised object detection without adding new learnable parameters to the existing models nor changing the structures of CNNs.","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62584864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A Novel Key-point Detector based on Sparse Coding. 基于稀疏编码的新型关键点检测器

IF 10.6 1区计算机科学

IEEE Transactions on Image Processing Pub Date : 2019-08-19 DOI: 10.1109/TIP.2019.2934891

Thanh Hong-Phuoc, Ling Guan

{"title":"A Novel Key-point Detector based on Sparse Coding.","authors":"Thanh Hong-Phuoc, Ling Guan","doi":"10.1109/TIP.2019.2934891","DOIUrl":"10.1109/TIP.2019.2934891","url":null,"abstract":"Most popular hand-crafted key-point detectors such as Harris corner, MSER, SIFT, SURF rely on some specific pre-designed structures for detection of corners, blobs, or junctions in an image. The very nature of pre-designed structures can be considered a source of inflexibility for these detectors in different contexts. Additionally, the performance of these detectors is also highly affected by non-uniform change in illumination. To the best of our knowledge, while there are some previous works addressing one of the two aforementioned problems, there currently lacks an efficient method to solve both simultaneously. In this paper, we propose a novel Sparse Coding based Key-point detector (SCK) which is fully invariant to affine intensity change and independent of any particular structure. The proposed detector locates a key-point in an image, based on a complexity measure calculated from the block surrounding its position. A strength measure is also proposed for comparing and selecting the detected key-points when the maximum number of key-points is limited. In this paper, the desirable characteristics of the proposed detector are theoretically confirmed. Experimental results on three public datasets also show that the proposed detector achieves significantly high performance in terms of repeatability and matching score.","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62585479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Semi-Supervised Deep Coupled Ensemble Learning with Classification Landmark Exploration. 半监督深度耦合集合学习与分类地标探索

IF 10.6 1区计算机科学

IEEE Transactions on Image Processing Pub Date : 2019-08-13 DOI: 10.1109/TIP.2019.2933724

Jichang Li, Si Wu, Cheng Liu, Zhiwen Yu, Hau-San Wong

{"title":"Semi-Supervised Deep Coupled Ensemble Learning with Classification Landmark Exploration.","authors":"Jichang Li, Si Wu, Cheng Liu, Zhiwen Yu, Hau-San Wong","doi":"10.1109/TIP.2019.2933724","DOIUrl":"10.1109/TIP.2019.2933724","url":null,"abstract":"Using an ensemble of neural networks with consistency regularization is effective for improving performance and stability of deep learning, compared to the case of a single network. In this paper, we present a semi-supervised Deep Coupled Ensemble (DCE) model, which contributes to ensemble learning and classification landmark exploration for better locating the final decision boundaries in the learnt latent space. First, multiple complementary consistency regularizations are integrated into our DCE model to enable the ensemble members to learn from each other and themselves, such that training experience from different sources can be shared and utilized during training. Second, in view of the possibility of producing incorrect predictions on a number of difficult instances, we adopt class-wise mean feature matching to explore important unlabeled instances as classification landmarks, on which the model predictions are more reliable. Minimizing the weighted conditional entropy on unlabeled data is able to force the final decision boundaries to move away from important training data points, which facilitates semi-supervised learning. Ensemble members could eventually have similar performance due to consistency regularization, and thus only one of these members is needed during the test stage, such that the efficiency of our model is the same as the non-ensemble case. Extensive experimental results demonstrate the superiority of our proposed DCE model over existing state-of-the-art semi-supervised learning methods.","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62584761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Deep Learning based Picture-Wise Just Noticeable Distortion Prediction Model for Image Compression. 基于深度学习的图像压缩 "画中画 "畸变预测模型

IF 10.6 1区计算机科学

IEEE Transactions on Image Processing Pub Date : 2019-08-13 DOI: 10.1109/TIP.2019.2933743

Huanhua Liu, Yun Zhang, Huan Zhang, Chunling Fan, Sam Kwong, C-C Jay Kuo, Xiaoping Fan

{"title":"Deep Learning based Picture-Wise Just Noticeable Distortion Prediction Model for Image Compression.","authors":"Huanhua Liu, Yun Zhang, Huan Zhang, Chunling Fan, Sam Kwong, C-C Jay Kuo, Xiaoping Fan","doi":"10.1109/TIP.2019.2933743","DOIUrl":"10.1109/TIP.2019.2933743","url":null,"abstract":"Picture Wise Just Noticeable Difference (PW-JND), which accounts for the minimum difference of a picture that human visual system can perceive, can be widely used in perception-oriented image and video processing. However, the conventional Just Noticeable Difference (JND) models calculate the JND threshold for each pixel or sub-band separately, which may not reflect the total masking effect of a picture accurately. In this paper, we propose a deep learning based PW-JND prediction model for image compression. Firstly, we formulate the task of predicting PW-JND as a multi-class classification problem, and propose a framework to transform the multi-class classification problem to a binary classification problem solved by just one binary classifier. Secondly, we construct a deep learning based binary classifier named perceptually lossy/lossless predictor which can predict whether an image is perceptually lossy to another or not. Finally, we propose a sliding window based search strategy to predict PW-JND based on the prediction results of the perceptually lossy/lossless predictor. Experimental results show that the mean accuracy of the perceptually lossy/lossless predictor reaches 92%, and the absolute prediction error of the proposed PW-JND model is 0.79 dB on average, which shows the superiority of the proposed PW-JND model to the conventional JND models.","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62584735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Multiresolution Localization with Temporal Scanning for Super-Resolution Diffuse Optical Imaging of Fluorescence. 利用时间扫描进行多分辨率定位，实现荧光的超分辨率漫反射光学成像。

IF 10.6 1区计算机科学

IEEE Transactions on Image Processing Pub Date : 2019-08-12 DOI: 10.1109/TIP.2019.2931080

Brian Z Bentz, Dergan Lin, Justin A Patel, Kevin J Webb

引用次数: 0

Summit Navigator: A Novel Approach for Local Maxima Extraction. 高峰导航仪：提取局部最大值的新方法

IF 10.6 1区计算机科学

IEEE Transactions on Image Processing Pub Date : 2019-08-12 DOI: 10.1109/TIP.2019.2932501

Tran Hiep Dinh, Manh Duong Phung, Quang Phuc Ha

引用次数: 0

Multi-Task Deep Relative Attribute Learning for Visual Urban Perception. 针对城市视觉感知的多任务深度相对属性学习

IF 10.6 1区计算机科学

IEEE Transactions on Image Processing Pub Date : 2019-08-07 DOI: 10.1109/TIP.2019.2932502

Weiqing Min, Shuhuan Mei, Linhu Liu, Yi Wang, Shuqiang Jiang

{"title":"Multi-Task Deep Relative Attribute Learning for Visual Urban Perception.","authors":"Weiqing Min, Shuhuan Mei, Linhu Liu, Yi Wang, Shuqiang Jiang","doi":"10.1109/TIP.2019.2932502","DOIUrl":"10.1109/TIP.2019.2932502","url":null,"abstract":"Visual urban perception aims to quantify perceptual attributes (e.g., safe and depressing attributes) of physical urban environment from crowd-sourced street-view images and their pairwise comparisons. It has been receiving more and more attention in computer vision for various applications, such as perceptive attribute learning and urban scene understanding. Most existing methods adopt either (i) a regression model trained using image features and ranked scores converted from pairwise comparisons for perceptual attribute prediction or (ii) a pairwise ranking algorithm to independently learn each perceptual attribute. However, the former fails to directly exploit pairwise comparisons while the latter ignores the relationship among different attributes. To address them, we propose a Multi-Task Deep Relative Attribute Learning Network (MTDRALN) to learn all the relative attributes simultaneously via multi-task Siamese networks, where each Siamese network will predict one relative attribute. Combined with deep relative attribute learning, we utilize the structured sparsity to exploit the prior from natural attribute grouping, where all the attributes are divided into different groups based on semantic relatedness in advance. As a result, MTDRALN is capable of learning all the perceptual attributes simultaneously via multi-task learning. Besides the ranking sub-network, MTDRALN further introduces the classification sub-network, and these two types of losses from two sub-networks jointly constrain parameters of the deep network to make the network learn more discriminative visual features for relative attribute learning. In addition, our network can be trained in an end-to-end way to make deep feature learning and multi-task relative attribute learning reinforce each other. Extensive experiments on the large-scale Place Pulse 2.0 dataset validate the advantage of our proposed network. Our qualitative results along with visualization of saliency maps also show that the proposed network is able to learn effective features for perceptual attributes.","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2019-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62584443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0