2013 IEEE Conference on Computer Vision and Pattern Recognition: Latest Publications

First-Person Activity Recognition: What Are They Doing to Me?
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.352
M. Ryoo, L. Matthies
{"title":"First-Person Activity Recognition: What Are They Doing to Me?","authors":"M. Ryoo, L. Matthies","doi":"10.1109/CVPR.2013.352","DOIUrl":"https://doi.org/10.1109/CVPR.2013.352","url":null,"abstract":"This paper discusses the problem of recognizing interaction-level human activities from a first-person viewpoint. The goal is to enable an observer (e.g., a robot or a wearable camera) to understand 'what activity others are performing to it' from continuous video inputs. These include friendly interactions such as 'a person hugging the observer' as well as hostile interactions like 'punching the observer' or 'throwing objects to the observer', whose videos involve a large amount of camera ego-motion caused by physical interactions. The paper investigates multi-channel kernels to integrate global and local motion information, and presents a new activity learning/recognition methodology that explicitly considers temporal structures displayed in first-person activity videos. In our experiments, we not only show classification results with segmented videos, but also confirm that our new approach is able to detect activities from continuous videos reliably.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"116 1","pages":"2730-2737"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80421715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 287
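The multi-channel kernel integration mentioned in this abstract follows a common pattern: compute one base kernel per motion channel (e.g., a global ego-motion histogram and a local descriptor histogram) and combine them with per-channel weights. A minimal Python sketch of that pattern is below; the chi-square base kernel, the channel names, and the equal weights are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def chi_square_kernel(x, y, eps=1e-10):
    """Chi-square kernel between two non-negative histograms."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.exp(-0.5 * np.sum((x - y) ** 2 / (x + y + eps)))

def multi_channel_kernel(video_a, video_b, weights):
    """Weighted sum of per-channel kernels.

    video_a / video_b: dicts mapping a channel name to its histogram,
    e.g. {"global_motion": ..., "local_motion": ...} (names are illustrative).
    """
    return sum(w * chi_square_kernel(video_a[c], video_b[c])
               for c, w in weights.items())

# Toy usage with made-up 8-bin motion histograms for two channels.
rng = np.random.default_rng(0)
a = {"global_motion": rng.random(8), "local_motion": rng.random(8)}
b = {"global_motion": rng.random(8), "local_motion": rng.random(8)}
print(multi_channel_kernel(a, b, {"global_motion": 0.5, "local_motion": 0.5}))
```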
A Fast Approximate AIB Algorithm for Distributional Word Clustering
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.78
Lei Wang, Jianjia Zhang, Luping Zhou, W. Li
{"title":"A Fast Approximate AIB Algorithm for Distributional Word Clustering","authors":"Lei Wang, Jianjia Zhang, Luping Zhou, W. Li","doi":"10.1109/CVPR.2013.78","DOIUrl":"https://doi.org/10.1109/CVPR.2013.78","url":null,"abstract":"Distributional word clustering merges the words having similar probability distributions to attain reliable parameter estimation, compact classification models and even better classification performance. Agglomerative Information Bottleneck (AIB) is one of the typical word clustering algorithms and has been applied to both traditional text classification and recent image recognition. Although enjoying theoretical elegance, AIB has one main issue on its computational efficiency, especially when clustering a large number of words. Different from existing solutions to this issue, we analyze the characteristics of its objective function-the loss of mutual information, and show that by merely using the ratio of word-class joint probabilities of each word, good candidate word pairs for merging can be easily identified. Based on this finding, we propose a fast approximate AIB algorithm and show that it can significantly improve the computational efficiency of AIB while well maintaining or even slightly increasing its classification performance. Experimental study on both text and image classification benchmark data sets shows that our algorithm can achieve more than 100 times speedup on large real data sets over the state-of-the-art method.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"58 1","pages":"556-563"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80504444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
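The standard AIB merge cost (the loss of mutual information when two words are merged) is easy to write down, and the abstract's speed-up idea, pre-filtering merge candidates using the ratio of word-class joint probabilities, can at least be gestured at. The sketch below computes the usual merge cost and, for a two-class toy problem, sorts words by their class-posterior log-ratio and scores only adjacent pairs; that pre-filter illustrates the spirit of the idea and is not the paper's exact rule.

```python
import numpy as np

def aib_merge_cost(p_w, p_c_given_w, i, j, eps=1e-12):
    """Standard AIB cost of merging words i and j: (p_i + p_j) times the
    weighted Jensen-Shannon divergence between their class posteriors."""
    p_i, p_j = p_w[i], p_w[j]
    pi_i, pi_j = p_i / (p_i + p_j), p_j / (p_i + p_j)
    p_bar = pi_i * p_c_given_w[i] + pi_j * p_c_given_w[j]
    kl = lambda p, q: np.sum(p * np.log((p + eps) / (q + eps)))
    js = pi_i * kl(p_c_given_w[i], p_bar) + pi_j * kl(p_c_given_w[j], p_bar)
    return (p_i + p_j) * js

# Toy data: 6 words, 2 classes, random joint probabilities p(w, c).
rng = np.random.default_rng(1)
p_wc = rng.random((6, 2))
p_wc /= p_wc.sum()
p_w = p_wc.sum(axis=1)
p_c_given_w = p_wc / p_w[:, None]

# Illustrative candidate pre-filter: sort words by the log-ratio of their class
# posteriors and score only adjacent pairs instead of all O(n^2) pairs.
order = np.argsort(np.log(p_c_given_w[:, 0] / p_c_given_w[:, 1]))
candidates = list(zip(order[:-1], order[1:]))
best = min(candidates, key=lambda ij: aib_merge_cost(p_w, p_c_given_w, *ij))
print("cheapest candidate merge:", best)
```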
Detection Evolution with Multi-order Contextual Co-occurrence
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.235
Guang Chen, Yuanyuan Ding, Jing Xiao, T. Han
{"title":"Detection Evolution with Multi-order Contextual Co-occurrence","authors":"Guang Chen, Yuanyuan Ding, Jing Xiao, T. Han","doi":"10.1109/CVPR.2013.235","DOIUrl":"https://doi.org/10.1109/CVPR.2013.235","url":null,"abstract":"Context has been playing an increasingly important role to improve the object detection performance. In this paper we propose an effective representation, Multi-Order Contextual co-Occurrence (MOCO), to implicitly model the high level context using solely detection responses from a baseline object detector. The so-called (1st-order) context feature is computed as a set of randomized binary comparisons on the response map of the baseline object detector. The statistics of the 1st-order binary context features are further calculated to construct a high order co-occurrence descriptor. Combining the MOCO feature with the original image feature, we can evolve the baseline object detector to a stronger context aware detector. With the updated detector, we can continue the evolution till the contextual improvements saturate. Using the successful deformable-part-model detector [13] as the baseline detector, we test the proposed MOCO evolution framework on the PASCAL VOC 2007 dataset [8] and Caltech pedestrian dataset [7]: The proposed MOCO detector outperforms all known state-of-the-art approaches, contextually boosting deformable part models (ver. 5) [13] by 3.3% in mean average precision on the PASCAL 2007 dataset. For the Caltech pedestrian dataset, our method further reduces the log-average miss rate from 48% to 46% and the miss rate at 1 FPPI from 25% to 23%, compared with the best prior art [6].","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"4 1","pages":"1798-1805"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81398816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 93
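The 1st-order context feature described here, randomized binary comparisons on the detector response map, is straightforward to illustrate. In the sketch below the comparison offsets, the neighborhood size, and the outer-product co-occurrence statistic are assumptions made for illustration; the paper's exact construction of the higher-order descriptor differs in its details.

```python
import numpy as np

def first_order_context(resp, center, pairs):
    """Randomized binary comparisons on a detector response map.

    resp   : 2D array of baseline-detector scores.
    center : (row, col) of the candidate detection.
    pairs  : (K, 2, 2) array of offset pairs; bit k is 1 if the response at the
             first offset exceeds the response at the second.
    """
    r, c = center
    h, w = resp.shape
    def score(dy, dx):
        # Clamp to the map so offsets near the border stay valid.
        return resp[np.clip(r + dy, 0, h - 1), np.clip(c + dx, 0, w - 1)]
    return np.array([score(*p[0]) > score(*p[1]) for p in pairs], dtype=np.uint8)

def cooccurrence(bits):
    """A simple higher-order statistic: co-occurrences of the context bits."""
    return np.outer(bits, bits)

# Toy usage: a random response map and 32 random comparison pairs within +/- 8 pixels.
rng = np.random.default_rng(2)
resp = rng.random((64, 64))
pairs = rng.integers(-8, 9, size=(32, 2, 2))
bits = first_order_context(resp, center=(30, 40), pairs=pairs)
print(bits.sum(), cooccurrence(bits).shape)
```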
Supervised Semantic Gradient Extraction Using Linear-Time Optimization
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.364
Shulin Yang, Jue Wang, L. Shapiro
{"title":"Supervised Semantic Gradient Extraction Using Linear-Time Optimization","authors":"Shulin Yang, Jue Wang, L. Shapiro","doi":"10.1109/CVPR.2013.364","DOIUrl":"https://doi.org/10.1109/CVPR.2013.364","url":null,"abstract":"This paper proposes a new supervised semantic edge and gradient extraction approach, which allows the user to roughly scribble over the desired region to extract semantically-dominant and coherent edges in it. Our approach first extracts low-level edge lets (small edge clusters) from the input image as primitives and build a graph upon them, by jointly considering both the geometric and appearance compatibility of edge lets. Given the characteristics of the graph, it cannot be effectively optimized by commonly-used energy minimization tools such as graph cuts. We thus propose an efficient linear algorithm for precise graph optimization, by taking advantage of the special structure of the graph. %Optimal parameter settings of the model are learnt from a dataset. Objective evaluations show that the proposed method significantly outperforms previous semantic edge detection algorithms. Finally, we demonstrate the effectiveness of the system in various image editing tasks.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"45 1","pages":"2826-2833"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81441017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
City-Scale Change Detection in Cadastral 3D Models Using Images
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.22
Aparna Taneja, Luca Ballan, M. Pollefeys
{"title":"City-Scale Change Detection in Cadastral 3D Models Using Images","authors":"Aparna Taneja, Luca Ballan, M. Pollefeys","doi":"10.1109/CVPR.2013.22","DOIUrl":"https://doi.org/10.1109/CVPR.2013.22","url":null,"abstract":"In this paper, we propose a method to detect changes in the geometry of a city using panoramic images captured by a car driving around the city. We designed our approach to account for all the challenges involved in a large scale application of change detection, such as, inaccuracies in the input geometry, errors in the geo-location data of the images, as well as, the limited amount of information due to sparse imagery. We evaluated our approach on an area of 6 square kilometers inside a city, using 3420 images downloaded from Google Street View. These images besides being publicly available, are also a good example of panoramic images captured with a driving vehicle, and hence demonstrating all the possible challenges resulting from such an acquisition. We also quantitatively compared the performance of our approach with respect to a ground truth, as well as to prior work. This evaluation shows that our approach outperforms the current state of the art.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"19 1","pages":"113-120"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89607666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 78
Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.9
Nikolaos Kyriazis, Antonis A. Argyros
{"title":"Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis","authors":"Nikolaos Kyriazis, Antonis A. Argyros","doi":"10.1109/CVPR.2013.9","DOIUrl":"https://doi.org/10.1109/CVPR.2013.9","url":null,"abstract":"In several hand-object(s) interaction scenarios, the change in the objects' state is a direct consequence of the hand's motion. This has a straightforward representation in Newtonian dynamics. We present the first approach that exploits this observation to perform model-based 3D tracking of a table-top scene comprising passive objects and an active hand. Our forward modelling of 3D hand-object(s) interaction regards both the appearance and the physical state of the scene and is parameterized over the hand motion (26 DoFs) between two successive instants in time. We demonstrate that our approach manages to track the 3D pose of all objects and the 3D pose and articulation of the hand by only searching for the parameters of the hand motion. In the proposed framework, covert scene state is inferred by connecting it to the overt state, through the incorporation of physics. Thus, our tracking approach treats a variety of challenging observability issues in a principled manner, without the need to resort to heuristics.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"62 1","pages":"9-16"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90369191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 70
Sparse Quantization for Patch Description
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.366
X. Boix, Michael Gygli, G. Roig, L. Gool
{"title":"Sparse Quantization for Patch Description","authors":"X. Boix, Michael Gygli, G. Roig, L. Gool","doi":"10.1109/CVPR.2013.366","DOIUrl":"https://doi.org/10.1109/CVPR.2013.366","url":null,"abstract":"The representation of local image patches is crucial for the good performance and efficiency of many vision tasks. Patch descriptors have been designed to generalize towards diverse variations, depending on the application, as well as the desired compromise between accuracy and efficiency. We present a novel formulation of patch description, that serves such issues well. Sparse quantization lies at its heart. This allows for efficient encodings, leading to powerful, novel binary descriptors, yet also to the generalization of existing descriptors like SIFT or BRIEF. We demonstrate the capabilities of our formulation for both key point matching and image classification. Our binary descriptors achieve state-of-the-art results for two key point matching benchmarks, namely those by Brown and Mikolajczyk. For image classification, we propose new descriptors, that perform similar to SIFT on Caltech101 and PASCAL VOC07.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"46 1","pages":"2842-2849"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89957834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 18
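Sparse quantization itself is a small operation: the nearest k-sparse binary vector to a non-negative feature vector is simply the indicator of its k largest entries. The sketch below applies it cell by cell to toy gradient-orientation energies to produce a binary patch code; the cell layout, bin count and choice of k are illustrative assumptions, not the descriptors proposed in the paper.

```python
import numpy as np

def sparse_quantize(x, k):
    """Nearest k-sparse binary vector to a non-negative vector:
    1 at the k largest entries, 0 elsewhere."""
    code = np.zeros(len(x), dtype=np.uint8)
    code[np.argsort(x)[-k:]] = 1
    return code

def binary_patch_descriptor(patch, k=2, cell=4, n_bins=8):
    """Toy binary descriptor: per-cell gradient-orientation energies,
    sparse-quantized cell by cell."""
    gy, gx = np.gradient(patch.astype(float))
    mag, ang = np.hypot(gx, gy), np.arctan2(gy, gx)
    bins = ((ang + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    codes = []
    for i in range(0, patch.shape[0], cell):
        for j in range(0, patch.shape[1], cell):
            hist = np.bincount(bins[i:i + cell, j:j + cell].ravel(),
                               weights=mag[i:i + cell, j:j + cell].ravel(),
                               minlength=n_bins)
            codes.append(sparse_quantize(hist, k))
    return np.concatenate(codes)

rng = np.random.default_rng(3)
desc = binary_patch_descriptor(rng.random((16, 16)))
print(desc.shape, desc.sum())  # 16 cells x 8 bins, exactly 2 ones per cell
```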
Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.103
Michele Fenzi, L. Leal-Taixé, B. Rosenhahn, J. Ostermann
{"title":"Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories","authors":"Michele Fenzi, L. Leal-Taixé, B. Rosenhahn, J. Ostermann","doi":"10.1109/CVPR.2013.103","DOIUrl":"https://doi.org/10.1109/CVPR.2013.103","url":null,"abstract":"In this paper, we propose a method for learning a class representation that can return a continuous value for the pose of an unknown class instance using only 2D data and weak 3D labeling information. Our method is based on generative feature models, i.e., regression functions learned from local descriptors of the same patch collected under different viewpoints. The individual generative models are then clustered in order to create class generative models which form the class representation. At run-time, the pose of the query image is estimated in a maximum a posteriori fashion by combining the regression functions belonging to the matching clusters. We evaluate our approach on the EPFL car dataset and the Pointing'04 face dataset. Experimental results show that our method outperforms by 10% the state-of-the-art in the first dataset and by 9% in the second.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"71 1","pages":"755-762"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85211272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 36
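The core ingredient here, a regression function that predicts a local descriptor as a function of viewpoint, can be sketched with a periodic least-squares fit; pose recovery then amounts to searching for the angle whose predicted descriptor best matches the query. The Fourier basis, the grid search standing in for the MAP step, and the synthetic descriptor below are all illustrative assumptions, not the models used in the paper.

```python
import numpy as np

def fourier_basis(theta, order=2):
    """Periodic basis functions of the viewing angle (radians)."""
    cols = [np.ones_like(theta)]
    for k in range(1, order + 1):
        cols += [np.cos(k * theta), np.sin(k * theta)]
    return np.stack(cols, axis=-1)

def fit_feature_regressor(angles, descs, order=2):
    """Least-squares fit of descriptor = fourier_basis(angle) @ W for one patch."""
    W, *_ = np.linalg.lstsq(fourier_basis(angles, order), descs, rcond=None)
    return W

def estimate_pose(query_desc, W, order=2, grid=360):
    """Grid search for the angle whose predicted descriptor is closest to the
    query (a crude stand-in for the paper's MAP combination of regressors)."""
    thetas = np.linspace(0, 2 * np.pi, grid, endpoint=False)
    preds = fourier_basis(thetas, order) @ W
    return thetas[np.argmin(np.linalg.norm(preds - query_desc, axis=1))]

# Toy example: a synthetic 16-D descriptor that varies smoothly with viewpoint.
rng = np.random.default_rng(4)
freqs = np.tile([1, 2], 8)
phases = rng.random(16) * 2 * np.pi
descriptor_at = lambda t: np.cos(np.outer(t, freqs) + phases)

train_angles = np.linspace(0, 2 * np.pi, 24, endpoint=False)
W = fit_feature_regressor(train_angles, descriptor_at(train_angles))
print(estimate_pose(descriptor_at(np.array([1.2]))[0], W))  # close to 1.2
```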
Studying Relationships between Human Gaze, Description, and Computer Vision
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.101
Kiwon Yun, Yifan Peng, D. Samaras, G. Zelinsky, Tamara L. Berg
{"title":"Studying Relationships between Human Gaze, Description, and Computer Vision","authors":"Kiwon Yun, Yifan Peng, D. Samaras, G. Zelinsky, Tamara L. Berg","doi":"10.1109/CVPR.2013.101","DOIUrl":"https://doi.org/10.1109/CVPR.2013.101","url":null,"abstract":"We posit that user behavior during natural viewing of images contains an abundance of information about the content of images as well as information related to user intent and user defined content importance. In this paper, we conduct experiments to better understand the relationship between images, the eye movements people make while viewing images, and how people construct natural language to describe images. We explore these relationships in the context of two commonly used computer vision datasets. We then further relate human cues with outputs of current visual recognition systems and demonstrate prototype applications for gaze-enabled detection and annotation.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"10 1","pages":"739-746"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84478752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 88
Efficient Detector Adaptation for Object Detection in a Video
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.418
Pramod Sharma, R. Nevatia
{"title":"Efficient Detector Adaptation for Object Detection in a Video","authors":"Pramod Sharma, R. Nevatia","doi":"10.1109/CVPR.2013.418","DOIUrl":"https://doi.org/10.1109/CVPR.2013.418","url":null,"abstract":"In this work, we present a novel and efficient detector adaptation method which improves the performance of an offline trained classifier (baseline classifier) by adapting it to new test datasets. We address two critical aspects of adaptation methods: generalizability and computational efficiency. We propose an adaptation method, which can be applied to various baseline classifiers and is computationally efficient also. For a given test video, we collect online samples in an unsupervised manner and train a random fern adaptive classifier. The adaptive classifier improves precision of the baseline classifier by validating the obtained detection responses from baseline classifier as correct detections or false alarms. Experiments demonstrate generalizability, computational efficiency and effectiveness of our method, as we compare our method with state of the art approaches for the problem of human detection and show good performance with high computational efficiency on two different baseline classifiers.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"721 1","pages":"3254-3261"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83484946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 38
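A random fern classifier of the kind this abstract trains online is compact enough to sketch: each fern is a small group of binary feature comparisons, each bit pattern indexes a leaf with per-class counts, and the per-fern posteriors are combined semi-naive-Bayes style. The feature choice (comparisons between dimensions of a detection's appearance vector), the fern sizes, and the toy data below are illustrative assumptions; the paper's online sample collection and features are its own.

```python
import numpy as np

class RandomFerns:
    """Minimal random-fern classifier used to validate detections as correct (1)
    or false alarms (0)."""

    def __init__(self, n_ferns=20, depth=4, n_dims=64, seed=0):
        rng = np.random.default_rng(seed)
        # Each fern compares `depth` random pairs of feature dimensions.
        self.pairs = rng.integers(0, n_dims, size=(n_ferns, depth, 2))
        # Laplace-smoothed leaf counts: class x fern x leaf.
        self.counts = np.ones((2, n_ferns, 2 ** depth))

    def _leaves(self, x):
        bits = (x[self.pairs[..., 0]] > x[self.pairs[..., 1]]).astype(int)
        return bits @ (1 << np.arange(bits.shape[1]))  # bit pattern -> leaf index

    def update(self, x, label):
        """Online training step with one sample."""
        self.counts[label, np.arange(len(self.pairs)), self._leaves(x)] += 1

    def prob_true_detection(self, x):
        idx = np.arange(len(self.pairs))
        leaves = self._leaves(x)
        log_lik = np.log(self.counts[:, idx, leaves]
                         / self.counts.sum(axis=-1)[:, idx])
        log_post = log_lik.sum(axis=1)  # sum log-likelihoods over ferns, per class
        post = np.exp(log_post - log_post.max())
        return post[1] / post.sum()

# Toy usage: 64-D appearance vectors with class-specific mean patterns.
rng = np.random.default_rng(5)
mean_true, mean_false = np.linspace(0, 1, 64), np.linspace(1, 0, 64)
ferns = RandomFerns()
for _ in range(200):
    ferns.update(mean_true + 0.3 * rng.normal(size=64), 1)   # "correct detection"
    ferns.update(mean_false + 0.3 * rng.normal(size=64), 0)  # "false alarm"
print(ferns.prob_true_detection(mean_true + 0.3 * rng.normal(size=64)))  # high
```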