2011 International Conference on Computer Vision: Latest Papers, Page 4

A selective spatio-temporal interest point detector for human action recognition in complex scenes

2011 International Conference on Computer Vision Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126443

Bhaskar Chakraborty, M. B. Holte, T. Moeslund, Jordi Gonzàlez, F. X. Roca

Abstract: Recent progress in the field of human action recognition points towards the use of Spatio-Temporal Interest Points (STIPs) for local descriptor-based recognition strategies. In this paper we present a new approach for STIP detection by applying surround suppression combined with local and temporal constraints. Our method is significantly different from existing STIP detectors and improves the performance by detecting more repeatable, stable and distinctive STIPs for human actors, while suppressing unwanted background STIPs. For action representation we use a bag-of-visual words (BoV) model of local N-jet features to build a vocabulary of visual-words. To this end, we introduce a novel vocabulary building strategy by combining spatial pyramid and vocabulary compression techniques, resulting in improved performance and efficiency. Action class specific Support Vector Machine (SVM) classifiers are trained for categorization of human actions. A comprehensive set of experiments on existing benchmark datasets, and more challenging datasets of complex scenes, validate our approach and show state-of-the-art performance.

Pages: 1776-1783

Citations: 49

Diagonal preconditioning for first order primal-dual algorithms in convex optimization

2011 International Conference on Computer Vision Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126441

T. Pock, A. Chambolle

Citations: 461

A dimensionality result for multiple homography matrices

2011 International Conference on Computer Vision Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126485

W. Chojnacki, A. Hengel

Citations: 6

Level-set person segmentation and tracking with multi-region appearance models and top-down shape information

2011 International Conference on Computer Vision Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126455

Esther Horbert, Konstantinos Rematas, B. Leibe

Citations: 40

Salient object detection by composition

2011 International Conference on Computer Vision Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126348

J. Feng, Yichen Wei, Litian Tao, Chao Zhang, Jian Sun

Citations: 166

The power of comparative reasoning

2011 International Conference on Computer Vision Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126527

J. Yagnik, Dennis W. Strelow, David A. Ross, Ruei-Sung Lin

Abstract: Rank correlation measures are known for their resilience to perturbations in numeric values and are widely used in many evaluation metrics. Such ordinal measures have rarely been applied in treatment of numeric features as a representational transformation. We emphasize the benefits of ordinal representations of input features both theoretically and empirically. We present a family of algorithms for computing ordinal embeddings based on partial order statistics. Apart from having the stability benefits of ordinal measures, these embeddings are highly nonlinear, giving rise to sparse feature spaces highly favored by several machine learning methods. These embeddings are deterministic, data independent and by virtue of being based on partial order statistics, add another degree of resilience to noise. These machine-learning-free methods when applied to the task of fast similarity search outperform state-of-the-art machine learning methods with complex optimization setups. For solving classification problems, the embeddings provide a nonlinear transformation resulting in sparse binary codes that are well-suited for a large class of machine learning algorithms. These methods show significant improvement on VOC 2010 using simple linear classifiers which can be trained quickly. Our method can be extended to the case of polynomial kernels, while permitting very efficient computation. Further, since the popular Min Hash algorithm is a special case of our method, we demonstrate an efficient scheme for computing Min Hash on conjunctions of binary features. The actual method can be implemented in about 10 lines of code in most languages (2 lines in MATLAB), and does not require any data-driven optimization.

Pages: 2431-2438
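The abstract's remark that the method fits in about ten lines can be illustrated with a minimal sketch. It assumes the embedding takes the form usually called winner-take-all hashing: for each code, look at a fixed random subset of `k` feature dimensions and record which position in that subset holds the largest value. The function name `wta_hash` and the parameters `num_codes`, `k`, and `seed` are hypothetical and do not come from the listing; this is one plausible reading of "ordinal embeddings based on partial order statistics", not the authors' reference code.

```python
import random


def wta_hash(x, num_codes, k, seed=0):
    """Ordinal embedding sketch: one integer in [0, k) per code.

    The random windows depend only on the seed and the input length,
    so the embedding is deterministic and data independent.
    """
    rng = random.Random(seed)
    dims = list(range(len(x)))
    codes = []
    for _ in range(num_codes):
        # k distinct dimensions in random order, i.e. the first k
        # entries of a random permutation of the feature indices.
        window = rng.sample(dims, k)
        # Keep only the position of the largest value in the window;
        # the numeric values themselves are discarded.
        codes.append(max(range(k), key=lambda j: x[window[j]]))
    return codes


def code_similarity(a, b):
    """Fraction of positions at which two code lists agree."""
    return sum(1 for u, v in zip(a, b) if u == v) / len(a)
```

Because each code depends only on the ordering of values inside its window, any strictly increasing rescaling of the input (for example `10 * v + 7` applied to every entry) leaves the codes unchanged, which is the kind of resilience to numeric perturbation the abstract attributes to ordinal measures.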

Citations: 129

Multi-view 3D reconstruction for scenes under the refractive plane with known vertical direction

2011 International Conference on Computer Vision Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126262

Yao-Jen Chang, Tsuhan Chen

Abstract: Images taken of underwater scenes suffer distortion due to refraction. While refraction causes magnification with mild distortion in the observed images, severe distortions in the geometry reconstruction result if the refractive distortion is not properly handled. Unlike the radial distortion model, the refractive distortion depends on the scene depth seen from each light ray as well as the camera pose relative to the refractive surface. Therefore, it is crucial to obtain a good estimate of scene depth, camera pose and optical center to alleviate the impact of refractive distortion. In this work, we formulate the forward and back projections of light rays involving a refractive plane for the perspective camera model by explicitly modeling refractive distortion as a function of depth. Furthermore, for cameras with an inertial measurement unit (IMU), we show that a linear solution to the relative pose and a closed-form solution to the absolute pose can be derived with known camera vertical directions. We incorporate our formulations into the general structure-from-motion framework, followed by the patch-based multiview stereo algorithm, to obtain a 3D reconstruction of the scene. We show through experiments that the explicit modeling of depth-dependent refractive distortion leads to more accurate scene reconstructions.

Pages: 351-358

Citations: 66

What characterizes a shadow boundary under the sun and sky?

2011 International Conference on Computer Vision Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126331

Xiang Huang, G. Hua, J. Tumblin, Lance Williams

Citations: 78

Video Primal Sketch: A generic middle-level representation of video

2011 International Conference on Computer Vision Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126380

Zhi Han, Zongben Xu, Song-Chun Zhu

Citations: 12

Inferring human gaze from appearance via adaptive linear regression

2011 International Conference on Computer Vision Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126237

Feng Lu, Yusuke Sugano, Takahiro Okabe, Yoichi Sato

Abstract: The problem of estimating human gaze from eye appearance is regarded as mapping high-dimensional features to a low-dimensional target space. Conventional methods require densely obtained training samples on the eye appearance manifold, which results in a tedious calibration stage. In this paper, we introduce an adaptive linear regression (ALR) method for accurate mapping via sparsely collected training samples. The key idea is to adaptively find the subset of training samples where the test sample is most linearly representable. We solve the problem via l1-optimization and thoroughly study the key issues to seek the best solution for regression. The proposed gaze estimation approach based on ALR is naturally sparse and low-dimensional, giving the ability to infer human gaze from variable-resolution eye images using far fewer training samples than existing methods. In particular, the optimization procedure in ALR is extended to solve the subpixel alignment problem simultaneously for low-resolution test eye images. Performance of the proposed method is evaluated by extensive experiments against various factors such as number of training samples, feature dimensionality and eye image resolution to verify its effectiveness.

Pages: 153-160
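The key idea stated in the abstract, representing the test sample with a sparse subset of training samples found by l1-optimization, can be written out as a rough sketch. The symbols are assumptions introduced here for illustration and are not given in the listing: the columns of $F$ are training eye-appearance features, the columns of $G$ are the corresponding gaze targets, $f$ is the test feature, and $\varepsilon$ is a reconstruction tolerance. The paper's exact constraints may differ.

```latex
\[
\hat{w} \;=\; \arg\min_{w} \,\lVert w \rVert_1
\quad \text{subject to} \quad
\lVert F w - f \rVert_2 \le \varepsilon,
\qquad
\hat{g} \;=\; G\,\hat{w}
\]
```

Under this reading, the nonzero entries of $\hat{w}$ select the training samples that best represent the test sample linearly, and the same weights are reused to interpolate the gaze estimate $\hat{g}$.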

Citations: 133