Latest Publications: 2017 IEEE International Conference on Computer Vision (ICCV)

Online Video Object Detection Using Association LSTM
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.257 | Pages: 2363-2371
Yongyi Lu, Cewu Lu, Chi-Keung Tang
Abstract: Video object detection is a fundamental tool for many applications. Since directly applying image-based object detection cannot leverage the rich temporal information inherent in video data, we advocate detecting long-range video object patterns. While the Long Short-Term Memory (LSTM) has been the de facto choice for such detection, current LSTMs cannot fundamentally model object association between consecutive frames. In this paper, we propose the association LSTM to address this fundamental association problem. Association LSTM not only regresses and classifies directly on object locations and categories but also associates features to represent each output object. By minimizing the matching error between these features, we learn how to associate objects in two consecutive frames. Additionally, our method works in an online manner, which is important for most video tasks. Our approach outperforms traditional video object detection methods on standard video datasets.
Citations: 101
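The core idea above is an association error between per-object feature vectors of consecutive frames. The sketch below is a minimal, hypothetical rendering of that idea in PyTorch: it assumes descriptors are already extracted and the frame-to-frame assignment is given, and it penalizes low cosine similarity between matched descriptors. The paper's actual loss and architecture may differ.

```python
import torch
import torch.nn.functional as F

def association_error(feat_t, feat_t1, assignment):
    """Toy association error between per-object descriptors of two
    consecutive frames (a sketch of the idea, not the paper's exact loss).

    feat_t, feat_t1: (N, D) object descriptors from frames t and t+1.
    assignment: (N,) LongTensor, index in frame t+1 matched to each
        object in frame t (given here; the method learns the association).
    """
    a = F.normalize(feat_t, dim=1)
    b = F.normalize(feat_t1[assignment], dim=1)
    # Matched descriptors should agree: penalize 1 - cosine similarity.
    return (1.0 - (a * b).sum(dim=1)).mean()
```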
From Square Pieces to Brick Walls: The Next Challenge in Solving Jigsaw Puzzles
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.434 | Pages: 4049-4057
Shir Gur, O. Ben-Shahar
Abstract: Research into computational jigsaw puzzle solving, an emerging theoretical problem with numerous applications, has focused in recent years on puzzles consisting of square pieces only. In this paper we extend the scope of appearance-based puzzle solving to "brick wall" jigsaw puzzles: rectangular pieces that may have different sizes and can be placed next to each other at an arbitrary offset along their abutting edge, a configuration closer to real-world puzzles. We present the new challenges that arise in brick wall puzzles and address them in two stages. First we concentrate on the reconstruction of the puzzle (with or without missing pieces) assuming an oracle for offset assignments. We show that despite the increased complexity of the problem, under these conditions performance can be made comparable to the state of the art in solving the simpler square-piece puzzles, and thereby argue that solving brick wall puzzles may be reduced to finding the correct offset between two neighboring pieces. We then focus on implementing the oracle computationally using a mixture of dissimilarity metrics and correlation matching. We show results on various brick wall puzzles and discuss how our work may open a new research path for the puzzle-solving community.
Citations: 17
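Since the paper reduces brick wall solving to finding the correct offset between two neighboring pieces, here is a toy NumPy offset search under stated assumptions: pieces are (H, W, C) arrays, only the abutting edge columns are compared, and a plain sum-of-squared-differences stands in for the paper's mixture of dissimilarity metrics and correlation matching.

```python
import numpy as np

def best_vertical_offset(piece_a, piece_b, max_offset):
    """Slide piece_b vertically along the right edge of piece_a and return
    the offset minimizing a simple SSD over the abutting edge columns."""
    edge_a = piece_a[:, -1, :].astype(float)  # right column of A: (Ha, C)
    edge_b = piece_b[:, 0, :].astype(float)   # left column of B:  (Hb, C)
    ha, hb = edge_a.shape[0], edge_b.shape[0]
    best_off, best_d = 0, np.inf
    for off in range(-max_offset, max_offset + 1):
        lo, hi = max(0, off), min(ha, hb + off)  # rows of A overlapping B
        if hi - lo < 1:
            continue
        d = np.mean((edge_a[lo:hi] - edge_b[lo - off:hi - off]) ** 2)
        if d < best_d:
            best_off, best_d = off, d
    return best_off, best_d
```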
DualNet: Learn Complementary Features for Image Recognition
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.62 | Pages: 502-510
Saihui Hou, X. Liu, Zilei Wang
Abstract: In this work we propose a novel framework named DualNet, aiming at learning more accurate representations for image recognition. Two parallel neural networks are coordinated to learn complementary features, thus constructing a wider network. Specifically, we logically divide an end-to-end deep convolutional neural network into two functional parts, i.e., feature extractor and image classifier. The extractors of the two subnetworks are placed side by side and together form the feature extractor of DualNet. The two-stream features are then aggregated in the final classifier for overall classification, while two auxiliary classifiers are appended behind the feature extractor of each subnetwork so that the separately learned features are discriminative on their own. The complementary constraint is imposed by weighting the three classifiers, which is the key to DualNet. A corresponding training strategy, consisting of iterative training and joint fine-tuning, is also proposed to make the two subnetworks cooperate well with each other. Finally, DualNet built on the well-known CaffeNet, VGGNet, NIN and ResNet is thoroughly investigated and experimentally evaluated on multiple datasets including CIFAR-100, Stanford Dogs and UEC FOOD-100. The results demonstrate that DualNet helps learn more accurate image representations and thus yields higher recognition accuracy; in particular, the performance on CIFAR-100 is state-of-the-art compared to recent works.
Citations: 71
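As a rough illustration of the two-extractor, three-classifier layout described above, here is a minimal PyTorch sketch. The extractor architecture, feature sizes, and loss weights are all invented for brevity; the paper builds on CaffeNet/VGGNet/NIN/ResNet backbones and trains with iterative training plus joint fine-tuning, which is not modeled here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualNetSketch(nn.Module):
    """Two parallel extractors + fused and auxiliary classifiers."""

    def __init__(self, num_classes, feat_dim=64):
        super().__init__()

        def extractor():
            return nn.Sequential(
                nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())

        self.f1, self.f2 = extractor(), extractor()
        self.aux1 = nn.Linear(feat_dim, num_classes)       # branch-1 features alone
        self.aux2 = nn.Linear(feat_dim, num_classes)       # branch-2 features alone
        self.fused = nn.Linear(2 * feat_dim, num_classes)  # overall classifier

    def forward(self, x):
        z1, z2 = self.f1(x), self.f2(x)
        return self.fused(torch.cat([z1, z2], dim=1)), self.aux1(z1), self.aux2(z2)

def dualnet_loss(outputs, target, w=(1.0, 0.3, 0.3)):
    """Weighted sum over the three classifiers (weights are guesses)."""
    fused, a1, a2 = outputs
    return (w[0] * F.cross_entropy(fused, target)
            + w[1] * F.cross_entropy(a1, target)
            + w[2] * F.cross_entropy(a2, target))
```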
Wavelet-SRNet: A Wavelet-Based CNN for Multi-scale Face Super Resolution
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.187 | Pages: 1698-1706
Huaibo Huang, R. He, Zhenan Sun, T. Tan
Abstract: Most modern face super-resolution methods resort to convolutional neural networks (CNNs) to infer high-resolution (HR) face images. When dealing with very low resolution (LR) images, the performance of these CNN-based methods degrades greatly. Meanwhile, these methods tend to produce over-smoothed outputs and miss some textural details. To address these challenges, this paper presents a wavelet-based CNN approach that can ultra-resolve a very low resolution face image of 16 × 16 or smaller pixel size to larger versions at multiple scaling factors (2×, 4×, 8× and even 16×) in a unified framework. Different from conventional CNN methods that directly infer HR images, our approach first learns to predict the LR image's corresponding series of HR wavelet coefficients before reconstructing HR images from them. To capture both the global topology information and local texture details of human faces, we present a flexible and extensible convolutional neural network with three types of loss: wavelet prediction loss, texture loss and full-image loss. Extensive experiments demonstrate that the proposed approach achieves more appealing results both quantitatively and qualitatively than state-of-the-art super-resolution methods.
Citations: 336
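The key observation above is that an HR image is fully determined by its wavelet coefficients, so a network can predict coefficients instead of pixels and invert the transform. The sketch below uses PyWavelets only to show that decomposition/reconstruction round trip; the CNN that would predict HR coefficients from an LR input is stubbed out as an identity, so this illustrates the transform, not the network.

```python
import pywt

def wavelet_view(hr_image, levels=2):
    """Wavelet round trip on a 2-D (grayscale) array.

    The coefficient-predicting CNN is stubbed out as an identity, so the
    output equals the input up to numerical error.
    """
    coeffs = pywt.wavedec2(hr_image, 'haar', level=levels)
    predicted = coeffs  # a real model would predict these from an LR input
    return pywt.waverec2(predicted, 'haar')
```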
SHaPE: A Novel Graph Theoretic Algorithm for Making Consensus-Based Decisions in Person Re-identification Systems
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.127 | Pages: 1124-1133
Arko Barman, S. Shah
Abstract: Person re-identification is a challenge in video-based surveillance where the goal is to identify the same person in different camera views. In recent years, many algorithms have been proposed that approach this problem by designing suitable feature representations for images of persons or by training appropriate distance metrics that learn to distinguish between images of different persons. Aggregating the results from multiple algorithms for person re-identification is a relatively less-explored area of research. In this paper, we formulate an algorithm that maps the ranking process in a person re-identification algorithm to a problem in graph theory. We then extend this formulation to allow the results from multiple algorithms to be combined into a consensus-based decision for the person re-identification problem. The algorithm is unsupervised and takes into account only the matching scores generated by multiple algorithms when creating a consensus of results. Further, we show how the graph-theoretic problem can be solved by a two-step process. First, we obtain a rough estimate of the solution using a greedy algorithm. Then, we extend the construction of the proposed graph so that the problem can be efficiently solved by means of Ant Colony Optimization, a heuristic path-searching algorithm for complex graphs. While we present the algorithm in the context of person re-identification, it can potentially be applied to the general problem of ranking items based on a consensus of multiple sets of scores or metric values.
Citations: 26
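SHaPE's actual machinery is a graph construction solved greedily and then refined with Ant Colony Optimization; as a much simpler stand-in, the toy function below shows only the score-level consensus idea: normalize each algorithm's matching scores onto a common scale, then rank gallery items by their mean normalized score.

```python
import numpy as np

def consensus_rank(score_matrix):
    """Rank gallery items by mean min-max-normalized score across algorithms.

    score_matrix: (num_algorithms, num_gallery), higher = better match.
    Returns gallery indices sorted best-first.
    """
    s = score_matrix.astype(float)
    mn = s.min(axis=1, keepdims=True)
    mx = s.max(axis=1, keepdims=True)
    s = (s - mn) / np.maximum(mx - mn, 1e-12)  # put every algorithm on [0, 1]
    return np.argsort(-s.mean(axis=0))
```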
Stepwise Metric Promotion for Unsupervised Video Person Re-identification
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.266 | Pages: 2448-2457
Zimo Liu, D. Wang, Huchuan Lu
Abstract: The intensive annotation cost and the rich but unlabeled data contained in videos motivate us to propose an unsupervised video-based person re-identification (re-ID) method. We start from two assumptions: 1) different video tracklets typically contain different persons, given that the tracklets are taken at distinct places or with long intervals; 2) within each tracklet, the frames are mostly of the same person. Based on these assumptions, this paper proposes a stepwise metric promotion approach to estimate the identities of training tracklets, which iterates between cross-camera tracklet association and feature learning. Specifically, we use each training tracklet as a query and perform retrieval in the cross-camera training set. Our method is built on reciprocal nearest-neighbor search and can eliminate hard negative label matches, i.e., the cross-camera nearest neighbors of the false matches in the initial rank list. A tracklet that passes the reciprocal nearest-neighbor check is considered to have the same ID as the query. Experimental results on the PRID 2011, iLIDS-VID, and MARS datasets show that the proposed method achieves very competitive re-ID accuracy compared with its supervised counterparts.
Citations: 163
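The cross-camera reciprocal nearest-neighbor check is the heart of the label-estimation step described above. Here is a simplified NumPy version, assuming fixed per-tracklet feature vectors and Euclidean distance (the method alternates this with feature learning, which is not shown):

```python
import numpy as np

def reciprocal_nn_pairs(feat_a, feat_b):
    """Mutual nearest neighbors between tracklets of two cameras.

    feat_a: (Na, D) tracklet features from camera A.
    feat_b: (Nb, D) tracklet features from camera B.
    Returns (i, j) pairs where i and j are each other's nearest neighbor;
    such pairs are treated as sharing an identity.
    """
    d = np.linalg.norm(feat_a[:, None, :] - feat_b[None, :, :], axis=2)
    nn_ab = d.argmin(axis=1)  # best match in B for each tracklet of A
    nn_ba = d.argmin(axis=0)  # best match in A for each tracklet of B
    return [(i, int(j)) for i, j in enumerate(nn_ab) if nn_ba[j] == i]
```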
Cascaded Feature Network for Semantic Segmentation of RGB-D Images
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.147 | Pages: 1320-1328
Di Lin, Guangyong Chen, D. Cohen-Or, P. Heng, Hui Huang
Abstract: Fully convolutional networks (FCNs) have been successfully applied to semantic segmentation of scenes represented with RGB images. Images augmented with a depth channel provide more understanding of the geometric information of the scene. The question is how to best exploit this additional information to improve segmentation performance. In this paper, we present a neural network with multiple branches for segmenting RGB-D images. Our approach is to use the available depth to split the image into layers with common visual characteristics of objects/scenes, or common "scene-resolution". We introduce the context-aware receptive field (CaRF), which provides better control over the relevant contextual information of the learned features. Equipped with CaRF, each branch of the network semantically segments a similar scene-resolution, leading to a more focused domain that is easier to learn. Furthermore, our network is cascaded, with features from one branch augmenting the features of the adjacent branch. We show that such cascading of features enriches the contextual information of each branch and enhances the overall performance. The accuracy our network achieves outperforms the state-of-the-art methods on two public datasets.
Citations: 107
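A minimal sketch of the depth-based branching rule described above: partition an RGB-D image into "scene-resolution" layers by thresholding the depth map. The thresholds and number of branches here are invented; the paper's CaRF and cascaded feature flow are not modeled.

```python
import numpy as np

def depth_branch_masks(depth, thresholds=(1.0, 3.0)):
    """Partition pixels into near/mid/far layers by depth.

    depth: (H, W) depth map in meters; thresholds are made-up bin edges.
    Returns one boolean mask per branch, ordered near to far.
    """
    edges = (-np.inf,) + tuple(thresholds) + (np.inf,)
    return [(depth > lo) & (depth <= hi)
            for lo, hi in zip(edges[:-1], edges[1:])]
```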
Depth and Image Restoration from Light Field in a Scattering Medium
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.263 | Pages: 2420-2429
Jiandong Tian, Zak Murez, Tong Cui, Zhen Zhang, D. Kriegman, R. Ramamoorthi
Abstract: Traditional imaging methods and computer vision algorithms are often ineffective when images are acquired in scattering media, such as underwater, fog, and biological tissue. Here, we explore the use of light field imaging and algorithms for image restoration and depth estimation that address the image degradation caused by the medium. Towards this end, we make the following three contributions. First, we present a new single-image restoration algorithm which removes backscatter and attenuation from images better than existing methods do, and apply it to each view in the light field. Second, we combine a novel transmission-based depth cue with existing correspondence and defocus cues to improve light field depth estimation. In densely scattering media, our transmission depth cue is critical for depth estimation, since the images have low signal-to-noise ratios which significantly degrade the performance of the correspondence and defocus cues. Finally, we propose shearing and refocusing multiple views of the light field to recover a single image of higher quality than is possible from a single view. We demonstrate the benefits of our method through extensive experimental results in a water tank.
Citations: 35
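For intuition about the per-view restoration step, the snippet below inverts a standard dehazing-style image-formation model, I = J * t + B * (1 - t), given an estimated transmission map t and backscatter color B. This is only the generic model family; the paper's contribution lies in how transmission and backscatter are estimated from the light field, which is not shown here.

```python
import numpy as np

def invert_scattering_model(image, transmission, backscatter):
    """Recover scene radiance J from I = J * t + B * (1 - t).

    image: (H, W, C) observed image in [0, 1].
    transmission: (H, W, 1) or scalar estimated transmission t.
    backscatter: (C,) or scalar estimated backscatter color B.
    """
    t = np.clip(transmission, 1e-3, 1.0)  # avoid division blow-up at t -> 0
    return (image - backscatter * (1.0 - t)) / t
```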
Following Gaze in Video
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.160 | Pages: 1444-1452
Adrià Recasens, Carl Vondrick, A. Khosla, A. Torralba
Abstract: Following the gaze of people inside videos is an important signal for understanding people and their actions. In this paper, we present an approach for following gaze in video by predicting where a person (in the video) is looking, even when the object is in a different frame. We collect VideoGaze, a new dataset which we use as a benchmark to both train and evaluate models. Given one frame with a person in it, our model estimates a density for gaze location in every frame and the probability that the person is looking in that particular frame. A key aspect of our approach is an end-to-end model that jointly estimates saliency, gaze pose, and geometric relationships between views while using only gaze as supervision. Visualizations suggest that the model learns to solve these intermediate tasks internally and automatically, without additional supervision. Experiments show that our approach follows gaze in video better than existing approaches, enabling a richer understanding of human activities in video.
Citations: 65
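Given the model's two outputs described above, a per-frame gaze-location density and a per-frame looking probability, one plausible decoding step (an assumption for illustration, not taken from the paper) picks the frame and pixel maximizing their product:

```python
import numpy as np

def decode_gaze(densities, frame_probs):
    """Pick the (frame, pixel) maximizing density peak x frame probability.

    densities: (T, H, W) per-frame gaze-location densities.
    frame_probs: (T,) probability the gaze target lies in each frame.
    Returns (frame_index, (row, col)).
    """
    peaks = densities.reshape(len(densities), -1).max(axis=1)
    t = int(np.argmax(peaks * frame_probs))
    rc = np.unravel_index(np.argmax(densities[t]), densities[t].shape)
    return t, rc
```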
Temporal Superpixels Based on Proximity-Weighted Patch Matching
2017 IEEE International Conference on Computer Vision (ICCV) | Pub Date: 2017-10-01 | DOI: 10.1109/ICCV.2017.390 | Pages: 3630-3638
Se-Ho Lee, Won-Dong Jang, Chang-Su Kim
Abstract: A temporal superpixel algorithm based on proximity-weighted patch matching (TS-PPM) is proposed in this work. We develop the proximity-weighted patch matching (PPM), which estimates the motion vector of a superpixel robustly by considering the patch matching distances of neighboring superpixels as well as the target superpixel. In each frame, we initialize superpixels by transferring the superpixel labels of the previous frame using PPM motion vectors. Then, we update the superpixel labels of boundary pixels based on a cost function composed of color, spatial, contour, and temporal consistency terms. Finally, we execute superpixel splitting, merging, and relabeling to regularize superpixel sizes and reduce incorrect labels. Experiments show that the proposed algorithm outperforms the state-of-the-art conventional algorithms significantly.
Citations: 9
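The proximity-weighted patch matching (PPM) step can be sketched as follows: score each candidate motion vector by the target superpixel's own matching distance plus a proximity-weighted sum of its neighbors' distances, and keep the minimizer. Candidate generation, the weights, and the patch-distance computation are assumed to be given in this sketch.

```python
import numpy as np

def ppm_best_motion(candidates, dist_self, dist_neighbors, weights):
    """Choose the candidate motion vector with the lowest combined cost.

    candidates: list of K (dy, dx) motion vectors.
    dist_self: (K,) matching distance of the target superpixel per candidate.
    dist_neighbors: (N, K) matching distances of N neighboring superpixels.
    weights: (N,) proximity weights of the neighbors.
    """
    total = dist_self + weights @ dist_neighbors  # (K,) combined costs
    return candidates[int(np.argmin(total))]
```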