2013 IEEE International Conference on Computer Vision: Latest Publications

Joint Learning of Discriminative Prototypes and Large Margin Nearest Neighbor Classifiers
2013 IEEE International Conference on Computer Vision | Pub Date: 2013-12-01 | DOI: 10.1109/ICCV.2013.386
Martin Köstinger, Paul Wohlhart, P. Roth, H. Bischof
Abstract: In this paper, we raise important issues concerning the evaluation complexity of existing Mahalanobis metric learning methods. The complexity scales linearly with the size of the dataset. This is especially cumbersome on large scale or for real-time applications with limited time budget. To alleviate this problem we propose to represent the dataset by a fixed number of discriminative prototypes. In particular, we introduce a new method that jointly chooses the positioning of prototypes and also optimizes the Mahalanobis distance metric with respect to these. We show that choosing the positioning of the prototypes and learning the metric in parallel leads to a drastically reduced evaluation effort while maintaining the discriminative essence of the original dataset. Moreover, for most problems our method performing k-nearest prototype (k-NP) classification on the condensed dataset leads to even better generalization compared to k-NN classification using all data. Results on a variety of challenging benchmarks demonstrate the power of our method. These include standard machine learning datasets as well as the challenging Public Figures Face Database. On the competitive machine learning benchmarks we are comparable to the state-of-the-art while being more efficient. On the face benchmark we clearly outperform the state-of-the-art in Mahalanobis metric learning with drastically reduced evaluation effort.
Pages: 3112-3119
Citations: 12
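To make the evaluation-cost argument concrete, the following minimal sketch shows k-nearest-prototype (k-NP) classification under a Mahalanobis metric, assuming the prototype positions, their labels, and the metric matrix M are already given by the joint optimization; the function names and toy data are illustrative, not the authors' code. Evaluation cost depends only on the number of prototypes, not on the size of the original dataset.

```python
import numpy as np

def mahalanobis_dists(x, prototypes, M):
    """Squared Mahalanobis distances (x - p)^T M (x - p) from x to all prototypes."""
    diffs = prototypes - x                      # (P, D)
    return np.einsum('pd,de,pe->p', diffs, M, diffs)

def knp_classify(x, prototypes, proto_labels, M, k=3):
    """k-nearest-prototype classification: majority vote among the k closest prototypes."""
    d = mahalanobis_dists(x, prototypes, M)
    nearest = np.argsort(d)[:k]
    votes = proto_labels[nearest]
    return np.bincount(votes).argmax()

# Toy usage: 6 prototypes in 2D for two classes, identity matrix standing in for the learned metric.
prototypes = np.array([[0., 0.], [0.5, 0.2], [0.1, 0.6],
                       [3., 3.], [3.2, 2.7], [2.8, 3.3]])
proto_labels = np.array([0, 0, 0, 1, 1, 1])
M = np.eye(2)
print(knp_classify(np.array([2.9, 3.1]), prototypes, proto_labels, M, k=3))  # -> 1
```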
Stable Hyper-pooling and Query Expansion for Event Detection
2013 IEEE International Conference on Computer Vision | Pub Date: 2013-12-01 | DOI: 10.1109/ICCV.2013.229
Matthijs Douze, Jérôme Revaud, C. Schmid, H. Jégou
Abstract: This paper makes two complementary contributions to event retrieval in large collections of videos. First, we propose hyper-pooling strategies that encode the frame descriptors into a representation of the video sequence in a stable manner. Our best choices compare favorably with regular pooling techniques based on k-means quantization. Second, we introduce a technique to improve the ranking. It can be interpreted either as a query expansion method or as a similarity adaptation based on the local context of the query video descriptor. Experiments on public benchmarks show that our methods are complementary and improve event retrieval results, without sacrificing efficiency.
Pages: 1825-1832
Citations: 37
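The baseline the authors compare against, regular pooling based on k-means quantization, can be sketched as follows: each frame descriptor is assigned to its nearest centroid and residuals are aggregated per cell into one fixed-size video vector. This is a hedged illustration of that baseline (VLAD-style aggregation assumed), not the proposed stable hyper-pooling function.

```python
import numpy as np

def kmeans_pool(frame_descs, centroids):
    """Pool per-frame descriptors into one video descriptor: assign each frame to
    its nearest centroid and sum the residuals per cell, then L2-normalize."""
    # frame_descs: (T, D) descriptors for T frames; centroids: (K, D)
    d2 = ((frame_descs[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)  # (T, K)
    assign = d2.argmin(axis=1)
    K, D = centroids.shape
    pooled = np.zeros((K, D))
    for k in range(K):
        sel = frame_descs[assign == k]
        if len(sel):
            pooled[k] = (sel - centroids[k]).sum(axis=0)
    v = pooled.ravel()
    return v / (np.linalg.norm(v) + 1e-12)

# Toy usage: 100 random frame descriptors pooled over 4 cells -> a 64-dim video vector.
rng = np.random.default_rng(0)
frames = rng.normal(size=(100, 16))
cents = rng.normal(size=(4, 16))
print(kmeans_pool(frames, cents).shape)  # (64,)
```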
Find the Best Path: An Efficient and Accurate Classifier for Image Hierarchies
2013 IEEE International Conference on Computer Vision | Pub Date: 2013-12-01 | DOI: 10.1109/ICCV.2013.40
Min Sun, Wanming Huang, S. Savarese
Abstract: Many methods have been proposed to solve the image classification problem for a large number of categories. Among them, methods based on tree-based representations achieve good trade-off between accuracy and test time efficiency. While focusing on learning a tree-shaped hierarchy and the corresponding set of classifiers, most of them [11, 2, 14] use a greedy prediction algorithm for test time efficiency. We argue that the dramatic decrease in accuracy at high efficiency is caused by the specific design choice of the learning and greedy prediction algorithms. In this work, we propose a classifier which achieves a better trade-off between efficiency and accuracy with a given tree-shaped hierarchy. First, we convert the classification problem as finding the best path in the hierarchy, and a novel branch-and-bound-like algorithm is introduced to efficiently search for the best path. Second, we jointly train the classifiers using a novel Structured SVM (SSVM) formulation with additional bound constraints. As a result, our method achieves a significant 4.65%, 5.43%, and 4.07% (relative 24.82%, 41.64%, and 109.79%) improvement in accuracy at high efficiency compared to state-of-the-art greedy "tree-based" methods [14] on Caltech-256 [15], SUN [32] and Image Net 1K [9] dataset, respectively. Finally, we show that our branch-and-bound-like algorithm naturally ranks the paths in the hierarchy (Fig. 8) so that users can further process them.
Pages: 265-272
Citations: 29
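The "best path" view can be illustrated with a small, generic best-first search: each node carries a classifier score, a partial path is ranked by its score plus an optimistic bound, and the first complete root-to-leaf path popped is exact. This is only a sketch of the search idea under an assumed additive path score and a known per-node score bound s_max; it is not the paper's SSVM training or its specific bounds.

```python
import heapq
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)

def height(node):
    """Number of edges on the longest downward path from node to a leaf."""
    return 0 if not node.children else 1 + max(height(c) for c in node.children)

def best_path(root, score_fn, s_max):
    """Best-first search for the root-to-leaf path maximizing the summed node scores.
    A partial path is ranked by its score plus an optimistic bound (s_max per remaining
    level), so the first leaf popped is guaranteed to be the exact best path."""
    start = score_fn(root)
    heap = [(-(start + s_max * height(root)), start, id(root), root, [root.label])]
    while heap:
        _, score, _, node, path = heapq.heappop(heap)
        if not node.children:          # leaf reached with the highest bound: optimal
            return path, score
        for child in node.children:
            s = score + score_fn(child)
            ub = s + s_max * height(child)
            heapq.heappush(heap, (-ub, s, id(child), child, path + [child.label]))
    return None

# Toy hierarchy: the scores stand in for per-node classifier responses on one image.
tree = Node('root', [Node('animal', [Node('cat'), Node('dog')]),
                     Node('vehicle', [Node('car'), Node('bike')])])
scores = {'root': 0.0, 'animal': 0.4, 'vehicle': 0.9, 'cat': 0.1,
          'dog': 0.2, 'car': 0.8, 'bike': 0.3}
print(best_path(tree, lambda n: scores[n.label], s_max=1.0))
# -> (['root', 'vehicle', 'car'], ~1.7)
```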
What Do You Do? Occupation Recognition in a Photo via Social Context
2013 IEEE International Conference on Computer Vision | Pub Date: 2013-12-01 | DOI: 10.1109/ICCV.2013.451
Ming Shao, Liangyue Li, Y. Fu
Abstract: In this paper, we investigate the problem of recognizing occupations of multiple people with arbitrary poses in a photo. Previous work utilizing single person's nearly frontal clothing information and fore/background context preliminarily proves that occupation recognition is computationally feasible in computer vision. However, in practice, multiple people with arbitrary poses are common in a photo, and recognizing their occupations is even more challenging. We argue that with appropriately built visual attributes, co-occurrence, and spatial configuration model that is learned through structure SVM, we can recognize multiple people's occupations in a photo simultaneously. To evaluate our method's performance, we conduct extensive experiments on a new well-labeled occupation database with 14 representative occupations and over 7K images. Results on this database validate our method's effectiveness and show that occupation recognition is solvable in a more general case.
Pages: 3631-3638
Citations: 31
Saliency Detection in Large Point Sets
2013 IEEE International Conference on Computer Vision | Pub Date: 2013-12-01 | DOI: 10.1109/ICCV.2013.446
Elizabeth Shtrom, G. Leifman, A. Tal
Abstract: While saliency in images has been extensively studied in recent years, there is very little work on saliency of point sets. This is despite the fact that point sets and range data are becoming ever more widespread and have myriad applications. In this paper we present an algorithm for detecting the salient points in unorganized 3D point sets. Our algorithm is designed to cope with extremely large sets, which may contain tens of millions of points. Such data is typical of urban scenes, which have recently become commonly available on the web. No previous work has handled such data. For general data sets, we show that our results are competitive with those of saliency detection of surfaces, although we do not have any connectivity information. We demonstrate the utility of our algorithm in two applications: producing a set of the most informative viewpoints and suggesting an informative city tour given a city scan.
Pages: 3591-3598
Citations: 67
Similarity Metric Learning for Face Recognition
2013 IEEE International Conference on Computer Vision | Pub Date: 2013-12-01 | DOI: 10.1109/ICCV.2013.299
Qiong Cao, Yiming Ying, Peng Li
Abstract: Recently, there is a considerable amount of efforts devoted to the problem of unconstrained face verification, where the task is to predict whether pairs of images are from the same person or not. This problem is challenging and difficult due to the large variations in face images. In this paper, we develop a novel regularization framework to learn similarity metrics for unconstrained face verification. We formulate its objective function by incorporating the robustness to the large intra-personal variations and the discriminative power of novel similarity metrics. In addition, our formulation is a convex optimization problem which guarantees the existence of its global solution. Experiments show that our proposed method achieves the state-of-the-art results on the challenging Labeled Faces in the Wild (LFW) database [10].
Pages: 2408-2415
Citations: 207
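A minimal sketch of how a learned similarity metric is used at verification time: score an image pair and threshold the score. The particular parameterization below, a bilinear similarity term minus a Mahalanobis distance term, is an assumption for illustration, and the matrices M and G stand in for the outcome of the paper's convex training.

```python
import numpy as np

def similarity_score(x, y, M, G):
    """Combined similarity: bilinear term x^T G y minus Mahalanobis term (x-y)^T M (x-y).
    (One common parameterization for learned verification metrics; assumed here.)"""
    diff = x - y
    return x @ G @ y - diff @ M @ diff

def same_person(x, y, M, G, threshold=0.0):
    """Verification decision: 'same' if the learned similarity exceeds a threshold."""
    return similarity_score(x, y, M, G) > threshold

# Toy usage with identity matrices standing in for the learned M and G.
d = 8
rng = np.random.default_rng(1)
x = rng.normal(size=d); x /= np.linalg.norm(x)
y = x + 0.05 * rng.normal(size=d); y /= np.linalg.norm(y)   # near-duplicate pair
z = rng.normal(size=d); z /= np.linalg.norm(z)              # unrelated face
M, G = np.eye(d), np.eye(d)
print(same_person(x, y, M, G), same_person(x, z, M, G))      # expected: True False
```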
Forward Motion Deblurring
2013 IEEE International Conference on Computer Vision | Pub Date: 2013-12-01 | DOI: 10.1109/ICCV.2013.185
Shicheng Zheng, Li Xu, Jiaya Jia
Abstract: We handle a special type of motion blur considering that cameras move primarily forward or backward. Solving this type of blur is of unique practical importance since nearly all car, traffic and bike-mounted cameras follow out-of-plane translational motion. We start with the study of geometric models and analyze the difficulty of existing methods to deal with them. We also propose a solution accounting for depth variation. Homographies associated with different 3D planes are considered and solved for in an optimization framework. Our method is verified on several natural image examples that cannot be satisfyingly dealt with by previous methods.
Pages: 1465-1472
Citations: 47
Structured Forests for Fast Edge Detection
2013 IEEE International Conference on Computer Vision | Pub Date: 2013-12-01 | DOI: 10.1109/ICCV.2013.231
Piotr Dollár, C. L. Zitnick
Abstract: Edge detection is a critical component of many vision systems, including object detectors and image segmentation algorithms. Patches of edges exhibit well-known forms of local structure, such as straight lines or T-junctions. In this paper we take advantage of the structure present in local image patches to learn both an accurate and computationally efficient edge detector. We formulate the problem of predicting local edge masks in a structured learning framework applied to random decision forests. Our novel approach to learning decision trees robustly maps the structured labels to a discrete space on which standard information gain measures may be evaluated. The result is an approach that obtains real time performance that is orders of magnitude faster than many competing state-of-the-art approaches, while also achieving state-of-the-art edge detection results on the BSDS500 Segmentation dataset and NYU Depth dataset. Finally, we show the potential of our approach as a general purpose edge detector by showing our learned edge models generalize well across datasets.
Pages: 1841-1848
Citations: 934
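The key trick the abstract mentions, mapping structured labels into a discrete space so that standard information-gain splits apply, can be sketched as follows: edge-mask patches are clustered into a small set of discrete labels, an ordinary random forest is trained on those labels, and a prediction indexes back into a stored mask. The clustering choice, patch sizes, and random data here are illustrative stand-ins, not the paper's exact mapping.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500
features = rng.normal(size=(n, 64))                                   # stand-in patch features
edge_masks = (rng.random(size=(n, 16 * 16)) < 0.2).astype(np.uint8)   # stand-in 16x16 edge masks

# 1) Discretize the structured label space: cluster the masks into C prototype masks.
C = 8
mask_clusters = KMeans(n_clusters=C, n_init=10, random_state=0).fit(edge_masks)
discrete_labels = mask_clusters.labels_

# 2) Train an ordinary random forest on the discrete labels using standard splits.
forest = RandomForestClassifier(n_estimators=20, random_state=0).fit(features, discrete_labels)

# 3) At test time, a predicted discrete label indexes back into a stored edge mask
#    (here simply the cluster's mean mask, thresholded).
pred = forest.predict(features[:1])[0]
predicted_mask = (mask_clusters.cluster_centers_[pred] > 0.5).reshape(16, 16)
print(pred, predicted_mask.shape)
```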
From Where and How to What We See
2013 IEEE International Conference on Computer Vision | Pub Date: 2013-12-01 | DOI: 10.1109/ICCV.2013.83
S. Karthikeyan, V. Jagadeesh, Renuka Shenoy, M. Eckstein, B. S. Manjunath
Abstract: Eye movement studies have confirmed that overt attention is highly biased towards faces and text regions in images. In this paper we explore a novel problem of predicting face and text regions in images using eye tracking data from multiple subjects. The problem is challenging as we aim to predict the semantics (face/text/background) only from eye tracking data without utilizing any image information. The proposed algorithm spatially clusters eye tracking data obtained in an image into different coherent groups and subsequently models the likelihood of the clusters containing faces and text using a fully connected Markov Random Field (MRF). Given the eye tracking data from a test image, it predicts potential face/head (humans, dogs and cats) and text locations reliably. Furthermore, the approach can be used to select regions of interest for further analysis by object detectors for faces and text. The hybrid eye position/object detector approach achieves better detection performance and reduced computation time compared to using only the object detection algorithm. We also present a new eye tracking dataset on 300 images selected from ICDAR, Street-view, Flickr and Oxford-IIIT Pet Dataset from 15 subjects.
Pages: 625-632
Citations: 32
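A minimal sketch of the first stage only, spatially clustering eye-tracking data into coherent groups, is shown below; the MRF that assigns face/text/background labels to the clusters is not reproduced. MeanShift is an assumed, illustrative clustering choice and the fixation data are synthetic.

```python
import numpy as np
from sklearn.cluster import MeanShift

# Synthetic fixations from several subjects: one dense group over a face region
# and one over a text region, in pixel coordinates.
rng = np.random.default_rng(3)
face_fixations = rng.normal(loc=[120, 80], scale=10, size=(60, 2))
text_fixations = rng.normal(loc=[300, 220], scale=12, size=(40, 2))
fixations = np.vstack([face_fixations, text_fixations])

# Spatially cluster the fixations into coherent groups (stage one of the pipeline).
clusters = MeanShift(bandwidth=30).fit(fixations)
print(len(clusters.cluster_centers_), clusters.cluster_centers_.round(1))
```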
Dynamic Pooling for Complex Event Recognition
2013 IEEE International Conference on Computer Vision | Pub Date: 2013-12-01 | DOI: 10.1109/ICCV.2013.339
Wei-Xin Li, Qian Yu, Ajay Divakaran, N. Vasconcelos
Abstract: The problem of adaptively selecting pooling regions for the classification of complex video events is considered. Complex events are defined as events composed of several characteristic behaviors, whose temporal configuration can change from sequence to sequence. A dynamic pooling operator is defined so as to enable a unified solution to the problems of event specific video segmentation, temporal structure modeling, and event detection. Video is decomposed into segments, and the segments most informative for detecting a given event are identified, so as to dynamically determine the pooling operator most suited for each sequence. This dynamic pooling is implemented by treating the locations of characteristic segments as hidden information, which is inferred, on a sequence-by-sequence basis, via a large-margin classification rule with latent variables. Although the feasible set of segment selections is combinatorial, it is shown that a globally optimal solution to the inference problem can be obtained efficiently, through the solution of a series of linear programs. Besides the coarse-level location of segments, a finer model of video structure is implemented by jointly pooling features of segment-tuples. Experimental evaluation demonstrates that the resulting event detector has state-of-the-art performance on challenging video datasets.
Pages: 2728-2735
Citations: 50
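The latent selection behind dynamic pooling can be illustrated for the simplest constraint, keeping a fixed number of segments: score each segment with the event model, select the top-k, and pool only those. For this plain cardinality constraint the greedy top-k choice is exact; the paper's more general feasible sets require solving small linear programs instead. The model vector and toy data below are stand-ins, not the authors' implementation.

```python
import numpy as np

def dynamic_pool(segment_feats, w, k):
    """Latent segment selection for one video: score every segment with w, keep the
    k highest-scoring segments (hidden variables), and average-pool only those."""
    scores = segment_feats @ w                       # (num_segments,)
    chosen = np.argsort(scores)[-k:]                 # indices of the k best segments
    pooled = segment_feats[chosen].mean(axis=0)      # pooled video representation
    return pooled, chosen, float(pooled @ w)

# Toy usage: a video split into 12 segments with 32-dim features.
rng = np.random.default_rng(2)
segs = rng.normal(size=(12, 32))
w = rng.normal(size=32)                              # stands in for the learned event model
pooled, chosen, score = dynamic_pool(segs, w, k=3)
print(sorted(chosen.tolist()), round(score, 3))
```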