2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops最新文献_第4页

Using closed captions to train activity recognizers that improve video retrieval 使用封闭字幕训练活动识别器，提高视频检索

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops Pub Date : 2009-06-20 DOI: 10.1109/CVPRW.2009.5204202

S. Gupta, R. Mooney

引用次数: 16

An implicit spatiotemporal shape model for human activity localization and recognition 人类活动定位与识别的隐式时空形状模型

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops Pub Date : 2009-06-20 DOI: 10.1109/CVPRW.2009.5204262

A. Oikonomopoulos, I. Patras, M. Pantic

{"title":"An implicit spatiotemporal shape model for human activity localization and recognition","authors":"A. Oikonomopoulos, I. Patras, M. Pantic","doi":"10.1109/CVPRW.2009.5204262","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204262","url":null,"abstract":"In this paper we address the problem of localisation and recognition of human activities in unsegmented image sequences. The main contribution of the proposed method is the use of an implicit representation of the spatiotemporal shape of the activity which relies on the spatiotemporal localization of characteristic, sparse, `visual words' and `visual verbs'. Evidence for the spatiotemporal localization of the activity are accumulated in a probabilistic spatiotemporal voting scheme. The local nature of our voting framework allows us to recover multiple activities that take place in the same scene, as well as activities in the presence of clutter and occlusions. We construct class-specific codebooks using the descriptors in the training set, where we take the spatial co-occurrences of pairs of codewords into account. The positions of the codeword pairs with respect to the object centre, as well as the frame in the training set in which they occur are subsequently stored in order to create a spatiotemporal model of codeword co-occurrences. During the testing phase, we use mean shift mode estimation in order to spatially segment the subject that performs the activities in every frame, and the Radon transform in order to extract the most probable hypotheses concerning the temporal segmentation of the activities within the continuous stream.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133820322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 35

Robust feature matching in 2.3µs 在2.3µs内实现鲁棒特征匹配

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops Pub Date : 2009-06-20 DOI: 10.1109/CVPRW.2009.5204314

S. Taylor, E. Rosten, T. Drummond

{"title":"Robust feature matching in 2.3µs","authors":"S. Taylor, E. Rosten, T. Drummond","doi":"10.1109/CVPRW.2009.5204314","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204314","url":null,"abstract":"In this paper we present a robust feature matching scheme in which features can be matched in 2.3µs. For a typical task involving 150 features per image, this results in a processing time of 500µs for feature extraction and matching. In order to achieve very fast matching we use simple features based on histograms of pixel intensities and an indexing scheme based on their joint distribution. The features are stored with a novel bit mask representation which requires only 44 bytes of memory per feature and allows computation of a dissimilarity score in 20ns. A training phase gives the patch-based features invariance to small viewpoint variations. Larger viewpoint variations are handled by training entirely independent sets of features from different viewpoints. A complete system is presented where a database of around 13,000 features is used to robustly localise a single planar target in just over a millisecond, including all steps from feature detection to model fitting. The resulting system shows comparable robustness to SIFT [8] and Ferns [14] while using a tiny fraction of the processing time, and in the latter case a fraction of the memory as well.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125116364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 78

Learning a hierarchical compositional representation of multiple object classes 学习多个对象类的分层组合表示

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops Pub Date : 2009-06-20 DOI: 10.1109/CVPRW.2009.5204332

A. Leonardis

引用次数: 0

Towards automated large scale discovery of image families 朝着自动大规模发现图像族的方向发展

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops Pub Date : 2009-06-20 DOI: 10.1109/CVPRW.2009.5204177

M. Aly, P. Welinder, Mario E. Munich, P. Perona

引用次数: 23

Feature based person detection beyond the visible spectrum 超越可见光谱的基于特征的人检测

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops Pub Date : 2009-06-20 DOI: 10.1109/CVPRW.2009.5204085

K. Jüngling, Michael Arens

引用次数: 62

Accurate estimation of pulmonary nodule's growth rate in CT images with nonrigid registration and precise nodule detection and segmentation 利用非刚性配准和精确的结节检测与分割，准确估计CT图像中肺结节的生长速度

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops Pub Date : 2009-06-20 DOI: 10.1109/CVPRW.2009.5204050

Yuanjie Zheng, C. Kambhamettu, T. Bauer, K. Steiner

引用次数: 16

On conversion from color to gray-scale images for face detection 用于人脸检测的彩色图像到灰度图像的转换

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops Pub Date : 2009-06-20 DOI: 10.1109/CVPRW.2009.5204297

Juwei Lu, K. Plataniotis

引用次数: 29

A syntax for image understanding 用于图像理解的语法

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops Pub Date : 2009-06-20 DOI: 10.1109/CVPRW.2009.5204337

N. Ahuja

引用次数: 0

Multi-view reconstruction for projector camera systems based on bundle adjustment 基于束调整的投影摄像系统多视点重建

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops Pub Date : 2009-06-20 DOI: 10.1109/CVPRW.2009.5204318

Furukawa Ryo, K. Inose, Hiroshi Kawasaki

{"title":"Multi-view reconstruction for projector camera systems based on bundle adjustment","authors":"Furukawa Ryo, K. Inose, Hiroshi Kawasaki","doi":"10.1109/CVPRW.2009.5204318","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204318","url":null,"abstract":"Range scanners using projector-camera systems have been studied actively in recent years as methods for measuring 3D shapes accurately and cost-effectively. To acquire an entire 3D shape of an object with such systems, the shape of the object should be captured from multiple directions and the set of captured shapes should be aligned using algorithms such as ICPs. Then, the aligned shapes are integrated into a single 3D shape model. However, the captured shapes are often distorted due to errors of intrinsic or extrinsic parameters of the camera and the projector. Because of these distortions, gaps between overlapped surfaces remain even after aligning the 3D shapes. In this paper, we propose a new method to capture an entire shape with high precision using an active stereo range scanner which consists of a projector and a camera with fixed relative positions. In the proposed method, minimization of calibration errors of the projector-camera pair and registration errors between 3D shapes from different viewpoints are simultaneously achieved. The proposed method can be considered as a variation of bundle adjustment techniques adapted to projector-camera systems. Since acquisition of correspondences between different views is not easy for projector-camera systems, a solution for the problem is also presented.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132854845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 8