2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)最新文献_第2页

A Multi-view Stereo Benchmark with High-Resolution Images and Multi-camera Videos 具有高分辨率图像和多摄像头视频的多视图立体基准

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2017-07-21 DOI: 10.1109/CVPR.2017.272

Thomas Schöps, Johannes L. Schönberger, S. Galliani, Torsten Sattler, K. Schindler, M. Pollefeys, Andreas Geiger

引用次数: 518

On Compressing Deep Models by Low Rank and Sparse Decomposition 基于低秩和稀疏分解的深度模型压缩

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2017-07-21 DOI: 10.1109/CVPR.2017.15

Xiyu Yu, Tongliang Liu, Xinchao Wang, D. Tao

{"title":"On Compressing Deep Models by Low Rank and Sparse Decomposition","authors":"Xiyu Yu, Tongliang Liu, Xinchao Wang, D. Tao","doi":"10.1109/CVPR.2017.15","DOIUrl":"https://doi.org/10.1109/CVPR.2017.15","url":null,"abstract":"Deep compression refers to removing the redundancy of parameters and feature maps for deep learning models. Low-rank approximation and pruning for sparse structures play a vital role in many compression works. However, weight filters tend to be both low-rank and sparse. Neglecting either part of these structure information in previous methods results in iteratively retraining, compromising accuracy, and low compression rates. Here we propose a unified framework integrating the low-rank and sparse decomposition of weight matrices with the feature map reconstructions. Our model includes methods like pruning connections as special cases, and is optimized by a fast SVD-free algorithm. It has been theoretically proven that, with a small sample, due to its generalizability, our model can well reconstruct the feature maps on both training and test data, which results in less compromising accuracy prior to the subsequent retraining. With such a warm start to retrain, the compression method always possesses several merits: (a) higher compression rates, (b) little loss of accuracy, and (c) fewer rounds to compress deep models. The experimental results on several popular models such as AlexNet, VGG-16, and GoogLeNet show that our model can significantly reduce the parameters for both convolutional and fully-connected layers. As a result, our model reduces the size of VGG-16 by 15×, better than other recent compression methods that use a single strategy.","PeriodicalId":6631,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"300 1","pages":"67-76"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77421516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 349

Efficient Multiple Instance Metric Learning Using Weakly Supervised Data 基于弱监督数据的高效多实例度量学习

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2017-07-21 DOI: 10.1109/CVPR.2017.630

M. Law, Yaoliang Yu, R. Urtasun, R. Zemel, E. Xing

引用次数: 13

Fast Boosting Based Detection Using Scale Invariant Multimodal Multiresolution Filtered Features 基于尺度不变多模态多分辨率滤波特征的快速增强检测

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2017-07-21 DOI: 10.1109/CVPR.2017.112

A. Costea, R. Varga, S. Nedevschi

引用次数: 17

Scene Parsing through ADE20K Dataset 通过ADE20K数据集进行场景解析

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2017-07-21 DOI: 10.1109/CVPR.2017.544

Bolei Zhou, Hang Zhao, Xavier Puig, S. Fidler, Adela Barriuso, A. Torralba

引用次数: 2156

Switching Convolutional Neural Network for Crowd Counting 切换卷积神经网络用于人群计数

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2017-07-21 DOI: 10.1109/CVPR.2017.429

Deepak Babu Sam, Shiv Surya, R. Venkatesh Babu

{"title":"Switching Convolutional Neural Network for Crowd Counting","authors":"Deepak Babu Sam, Shiv Surya, R. Venkatesh Babu","doi":"10.1109/CVPR.2017.429","DOIUrl":"https://doi.org/10.1109/CVPR.2017.429","url":null,"abstract":"We propose a novel crowd counting model that maps a given crowd scene to its density. Crowd analysis is compounded by myriad of factors like inter-occlusion between people due to extreme crowding, high similarity of appearance between people and background elements, and large variability of camera view-points. Current state-of-the art approaches tackle these factors by using multi-scale CNN architectures, recurrent networks and late fusion of features from multi-column CNN with different receptive fields. We propose switching convolutional neural network that leverages variation of crowd density within an image to improve the accuracy and localization of the predicted crowd count. Patches from a grid within a crowd scene are relayed to independent CNN regressors based on crowd count prediction quality of the CNN established during training. The independent CNN regressors are designed to have different receptive fields and a switch classifier is trained to relay the crowd scene patch to the best CNN regressor. We perform extensive experiments on all major crowd counting datasets and evidence better performance compared to current state-of-the-art methods. We provide interpretable representations of the multichotomy of space of crowd scene patches inferred from the switch. It is observed that the switch relays an image patch to a particular CNN column based on density of crowd.","PeriodicalId":6631,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"1 1","pages":"4031-4039"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85418611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 792

On the Effectiveness of Visible Watermarks 关于可见水印的有效性

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2017-07-21 DOI: 10.1109/CVPR.2017.726

Tali Dekel, Michael Rubinstein, Ce Liu, W. Freeman

{"title":"On the Effectiveness of Visible Watermarks","authors":"Tali Dekel, Michael Rubinstein, Ce Liu, W. Freeman","doi":"10.1109/CVPR.2017.726","DOIUrl":"https://doi.org/10.1109/CVPR.2017.726","url":null,"abstract":"Visible watermarking is a widely-used technique for marking and protecting copyrights of many millions of images on the web, yet it suffers from an inherent security flaw—watermarks are typically added in a consistent manner to many images. We show that this consistency allows to automatically estimate the watermark and recover the original images with high accuracy. Specifically, we present a generalized multi-image matting algorithm that takes a watermarked image collection as input and automatically estimates the foreground (watermark), its alpha matte, and the background (original) images. Since such an attack relies on the consistency of watermarks across image collection, we explore and evaluate how it is affected by various types of inconsistencies in the watermark embedding that could potentially be used to make watermarking more secured. We demonstrate the algorithm on stock imagery available on the web, and provide extensive quantitative analysis on synthetic watermarked data. A key takeaway message of this paper is that visible watermarks should be designed to not only be robust against removal from a single image, but to be more resistant to mass-scale removal from image collections as well.","PeriodicalId":6631,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"103 1","pages":"6864-6872"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85844360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 45

Deeply Supervised Salient Object Detection with Short Connections 基于短连接的深度监督显著目标检测

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2017-07-21 DOI: 10.1109/CVPR.2017.563

Qibin Hou, Ming-Ming Cheng, Xiaowei Hu, A. Borji, Z. Tu, Philip H. S. Torr

{"title":"Deeply Supervised Salient Object Detection with Short Connections","authors":"Qibin Hou, Ming-Ming Cheng, Xiaowei Hu, A. Borji, Z. Tu, Philip H. S. Torr","doi":"10.1109/CVPR.2017.563","DOIUrl":"https://doi.org/10.1109/CVPR.2017.563","url":null,"abstract":"Recent progress on saliency detection is substantial, benefiting mostly from the explosive development of Convolutional Neural Networks (CNNs). Semantic segmentation and saliency detection algorithms developed lately have been mostly based on Fully Convolutional Neural Networks (FCNs). There is still a large room for improvement over the generic FCN models that do not explicitly deal with the scale-space problem. Holisitcally-Nested Edge Detector (HED) provides a skip-layer structure with deep supervision for edge and boundary detection, but the performance gain of HED on saliency detection is not obvious. In this paper, we propose a new saliency method by introducing short connections to the skip-layer structures within the HED architecture. Our framework provides rich multi-scale feature maps at each layer, a property that is critically needed to perform segment detection. Our method produces state-of-the-art results on 5 widely tested salient object detection benchmarks, with advantages in terms of efficiency (0.08 seconds per image), effectiveness, and simplicity over the existing algorithms.","PeriodicalId":6631,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"93 1","pages":"5300-5309"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76029187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 125

A Dataset for Benchmarking Image-Based Localization 基于图像的定位基准数据集

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2017-07-21 DOI: 10.1109/CVPR.2017.598

Xun Sun, Yuanfan Xie, Peiwen Luo, Liang Wang

{"title":"A Dataset for Benchmarking Image-Based Localization","authors":"Xun Sun, Yuanfan Xie, Peiwen Luo, Liang Wang","doi":"10.1109/CVPR.2017.598","DOIUrl":"https://doi.org/10.1109/CVPR.2017.598","url":null,"abstract":"A novel dataset for benchmarking image-based localization is presented. With increasing research interests in visual place recognition and localization, several datasets have been published in the past few years. One of the evident limitations of existing datasets is that precise ground truth camera poses of query images are not available in a meaningful 3D metric system. This is in part due to the underlying 3D models of these datasets are reconstructed from Structure from Motion methods. So far little attention has been paid to metric evaluations of localization accuracy. In this paper we address the problem of whether state-of-the-art visual localization techniques can be applied to tasks with demanding accuracy requirements. We acquired training data for a large indoor environment with cameras and a LiDAR scanner. In addition, we collected over 2000 query images with cell phone cameras. Using LiDAR point clouds as a reference, we employed a semi-automatic approach to estimate the 6 degrees of freedom camera poses precisely in the world coordinate system. The proposed dataset enables us to quantitatively assess the performance of various algorithms using a fair and intuitive metric.","PeriodicalId":6631,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"8 1","pages":"5641-5649"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87882579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 44

Exclusivity-Consistency Regularized Multi-view Subspace Clustering 排他-一致性正则化多视图子空间聚类

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2017-07-21 DOI: 10.1109/CVPR.2017.8

Xiaobo Wang, Xiaojie Guo, Zhen Lei, Changqing Zhang, S. Li

引用次数: 169