Latest papers from the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops

Using closed captions to train activity recognizers that improve video retrieval
S. Gupta, R. Mooney
DOI: 10.1109/CVPRW.2009.5204202 · Published 2009-06-20
Abstract: Recognizing activities in real-world videos is a difficult problem, exacerbated by background clutter, changes in camera angle and zoom, rapid camera movements, etc. Large corpora of labeled videos can be used to train automated activity recognition systems, but this requires expensive human labor and time. This paper explores how the closed captions that naturally accompany many videos can act as weak supervision, allowing 'labeled' data for activity recognition to be collected automatically. We show that such an approach can improve activity retrieval in soccer videos. Our system requires no manual labeling of video clips and needs minimal human supervision. We also present a novel caption classifier that uses additional linguistic information to determine whether a specific comment refers to an ongoing activity. We demonstrate that combining linguistic analysis and automatically trained activity recognizers can significantly improve the precision of video retrieval.
Citations: 16
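The weak-supervision idea (harvesting training labels from caption text) can be sketched as simple keyword spotting. This is an illustrative toy, not the paper's caption classifier; the keyword table and caption format are assumptions:

```python
def weak_activity_labels(captions, keywords):
    """Harvest weakly-labelled training clips: a caption mentioning an
    activity keyword labels the co-occurring clip with that activity.
    Keyword table and caption format are illustrative assumptions,
    not taken from the paper."""
    labels = []
    for clip_id, text in captions:
        lowered = text.lower()
        for activity, words in keywords.items():
            if any(w in lowered for w in words):
                labels.append((clip_id, activity))
    return labels
```

The paper's caption classifier additionally uses linguistic analysis to decide whether a comment refers to an ongoing activity; this sketch keeps only the keyword-spotting core.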
An implicit spatiotemporal shape model for human activity localization and recognition
A. Oikonomopoulos, I. Patras, M. Pantic
DOI: 10.1109/CVPRW.2009.5204262 · Published 2009-06-20
Abstract: In this paper we address the problem of localisation and recognition of human activities in unsegmented image sequences. The main contribution of the proposed method is the use of an implicit representation of the spatiotemporal shape of the activity, which relies on the spatiotemporal localization of characteristic, sparse 'visual words' and 'visual verbs'. Evidence for the spatiotemporal localization of the activity is accumulated in a probabilistic spatiotemporal voting scheme. The local nature of our voting framework allows us to recover multiple activities that take place in the same scene, as well as activities in the presence of clutter and occlusions. We construct class-specific codebooks using the descriptors in the training set, where we take the spatial co-occurrences of pairs of codewords into account. The positions of the codeword pairs with respect to the object centre, as well as the frames in the training set in which they occur, are subsequently stored in order to create a spatiotemporal model of codeword co-occurrences. During the testing phase, we use mean shift mode estimation to spatially segment the subject performing the activities in every frame, and the Radon transform to extract the most probable hypotheses concerning the temporal segmentation of the activities within the continuous stream.
Citations: 35
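The probabilistic voting scheme described above follows the familiar implicit-shape-model pattern: each detected codeword casts votes for the object centre at the displacements stored during training. A minimal 2D sketch (names and weighting are illustrative, not the authors' implementation):

```python
import numpy as np

def vote_for_centre(detections, offsets, shape):
    """Implicit-shape-model style voting: each detected codeword casts
    votes at its stored training displacements from the object centre,
    weighted so every codeword contributes one unit in total.
    (2D toy; names and weighting are illustrative assumptions.)"""
    acc = np.zeros(shape)
    for cid, (x, y) in detections:
        disp = offsets.get(cid, [])
        for dx, dy in disp:
            cx, cy = int(round(x + dx)), int(round(y + dy))
            if 0 <= cx < shape[0] and 0 <= cy < shape[1]:
                acc[cx, cy] += 1.0 / len(disp)
    return acc
```

Peaks in the accumulator then serve as centre hypotheses; the local support of each vote is what lets multiple activities in one scene produce separate peaks.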
GPU-accelerated, gradient-free MI deformable registration for atlas-based MR brain image segmentation
Xiao Han, L. Hibbard, V. Willcut
DOI: 10.1109/CVPRW.2009.5204043 · Published 2009-06-20
Abstract: Brain structure segmentation is an important task in many neuroscience and clinical applications. In this paper, we introduce a novel MI-based dense deformable registration method and apply it to the automatic segmentation of detailed brain structures. Together with a multiple-atlas fusion strategy, very accurate segmentation results were obtained compared with other methods reported in the literature. To make multi-atlas segmentation computationally feasible, we also propose to take advantage of recent advances in GPU technology and introduce a GPU-based implementation of the proposed registration method. With GPU acceleration it takes less than 8 minutes to compile a multi-atlas segmentation for each subject, even with as many as 17 atlases, which demonstrates that the use of GPUs can greatly facilitate the application of such atlas-based segmentation methods in practice.
Citations: 33
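The similarity measure driving MI-based registration is mutual information estimated from the joint intensity histogram of the two images. A minimal NumPy sketch of that measure (not the paper's GPU implementation):

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Mutual information of two images, estimated from their joint
    intensity histogram -- the similarity measure an MI-based
    deformable registration maximizes."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = joint / joint.sum()                    # joint probability
    px = p.sum(axis=1, keepdims=True)          # marginal of a
    py = p.sum(axis=0, keepdims=True)          # marginal of b
    nz = p > 0                                 # avoid log(0)
    return float(np.sum(p[nz] * np.log(p[nz] / (px * py)[nz])))
```

MI is high when the joint histogram is concentrated (intensities predict each other) and near zero for statistically independent images, which is why it works across differing intensity profiles.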
Is there a general structure for grammars?
D. Mumford
DOI: 10.1109/CVPRW.2009.5204334 · Published 2009-06-20
Abstract (summary form only): Linguists have proposed dozens of formalisms for grammars, and now vision is weighing in with its own versions based on its needs. Ulf Grenander has proposed general pattern theory, and has used grammar-like graphical parses of "thoughts" in the style of AI. One wants a natural, simple formalism treating all these cases; I want to pose this as a central problem in modeling intelligence. Pattern theory started in the 1970s with the ideas of Ulf Grenander and his school at Brown. The aim is to analyze, from a statistical point of view, the patterns in all "signals" generated by the world, whether they be images, sounds, written text, DNA or protein strings, spike trains in neurons, time series of prices or weather, etc. Pattern theory proposes that the types of patterns (and the hidden variables needed to describe them) found in one class of signals will often be found in the others, and that their characteristic variability will be similar. The underlying idea is to find classes of stochastic models which can capture all the patterns that we see in nature, so that random samples from these models have the same "look and feel" as samples from the world itself. Then the detection of patterns in noisy and ambiguous samples can be achieved by the use of Bayes' rule, a method that can be described as "analysis by synthesis".
Citations: 0
3D stochastic completion fields for fiber tractography
P. MomayyezSiahkal, Kaleem Siddiqi
DOI: 10.1109/CVPRW.2009.5204044 · Published 2009-06-20
Abstract: We approach the problem of fiber tractography from the viewpoint that a computational theory should relate to the underlying quantity being measured: the diffusion of water molecules. We characterize the Brownian motion of water by a 3D random walk described by a stochastic non-linear differential equation. We show that the maximum-likelihood trajectories are 3D elastica, or curves of least energy. We illustrate the model with Monte Carlo (sequential) simulations and then develop a more efficient (local, parallelizable) implementation based on the Fokker-Planck equation. The final algorithm allows us to efficiently compute stochastic completion fields connecting a source region to a sink region while taking into account the underlying diffusion MRI data. We demonstrate promising tractography results using high angular resolution diffusion data as input.
Citations: 10
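The 3D random walk underlying the completion fields pairs a position that moves at unit speed with a direction that diffuses on the unit sphere. A toy Monte Carlo sketch of one such walk, with illustrative parameters (not the paper's data-driven model):

```python
import numpy as np

def random_walk_3d(p0, v0, steps=100, sigma=0.1, seed=0):
    """Simulate one particle of the random-walk model: the direction
    vector diffuses (Gaussian perturbation, renormalized to the unit
    sphere) while the position integrates the direction at unit speed.
    Parameters here are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p0, dtype=float)
    v = np.asarray(v0, dtype=float)
    v /= np.linalg.norm(v)
    traj = [p.copy()]
    for _ in range(steps):
        v = v + sigma * rng.standard_normal(3)   # angular diffusion
        v /= np.linalg.norm(v)
        p = p + v                                # unit-speed translation
        traj.append(p.copy())
    return np.array(traj)
```

Averaging many such walks from a source, and matching them with walks from a sink, is what the Fokker-Planck implementation computes without explicit sampling.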
Nonparametric bottom-up saliency detection by self-resemblance
H. Seo, P. Milanfar
DOI: 10.1109/CVPRW.2009.5204207 · Published 2009-06-20
Abstract: We present a novel bottom-up saliency detection algorithm. Our method computes so-called local regression kernels (i.e., local features) from the given image, which measure the likeness of a pixel to its surroundings. Visual saliency is then computed using this "self-resemblance" measure. The framework results in a saliency map where each pixel indicates the statistical likelihood of saliency of a feature matrix given its surrounding feature matrices. As a similarity measure, matrix cosine similarity (a generalization of cosine similarity) is employed. State-of-the-art performance is demonstrated on commonly used human eye-fixation data [3] and some psychological patterns.
Citations: 116
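Matrix cosine similarity, the generalization of cosine similarity the abstract refers to, is commonly defined via the Frobenius inner product; a minimal sketch under that assumption:

```python
import numpy as np

def matrix_cosine_similarity(A, B):
    """Cosine similarity generalized to matrices: the Frobenius inner
    product of A and B divided by the product of their Frobenius
    norms. Equals 1 when B is a positive multiple of A."""
    num = np.sum(A * B)                           # = trace(A.T @ B)
    den = np.linalg.norm(A) * np.linalg.norm(B)   # Frobenius norms
    return num / den
```

Because the measure is scale-invariant, two feature matrices that differ only by contrast score as identical, which suits comparing a patch against its surroundings.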
Robust feature matching in 2.3µs
S. Taylor, E. Rosten, T. Drummond
DOI: 10.1109/CVPRW.2009.5204314 · Published 2009-06-20
Abstract: In this paper we present a robust feature matching scheme in which features can be matched in 2.3µs. For a typical task involving 150 features per image, this results in a processing time of 500µs for feature extraction and matching. In order to achieve very fast matching we use simple features based on histograms of pixel intensities and an indexing scheme based on their joint distribution. The features are stored with a novel bit-mask representation which requires only 44 bytes of memory per feature and allows computation of a dissimilarity score in 20ns. A training phase gives the patch-based features invariance to small viewpoint variations. Larger viewpoint variations are handled by training entirely independent sets of features from different viewpoints. A complete system is presented in which a database of around 13,000 features is used to robustly localise a single planar target in just over a millisecond, including all steps from feature detection to model fitting. The resulting system shows robustness comparable to SIFT [8] and Ferns [14] while using a tiny fraction of the processing time, and in the latter case a fraction of the memory as well.
Citations: 78
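The bit-mask dissimilarity idea can be sketched as follows: each sample pixel of a patch stores a small mask of which quantized intensity bins were observed during training, and the score counts pixels whose observed bin was never seen. The data layout below is illustrative, not the paper's 44-byte format:

```python
def patch_dissimilarity(observed_bins, masks):
    """Count sample pixels whose quantized intensity bin (an index)
    has its bit unset in that pixel's training mask, i.e. was never
    observed for this feature during training. Layout is an
    illustrative assumption, not the paper's exact representation."""
    return sum(1 for b, m in zip(observed_bins, masks)
               if not (m >> b) & 1)
```

Packing the masks into machine words and testing bins with AND/popcount-style operations is what makes a score computable in tens of nanoseconds.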
A method for selecting and ranking quality metrics for optimization of biometric recognition systems
N. Schmid, Francesco Nicolo
DOI: 10.1109/CVPRW.2009.5204309 · Published 2009-06-20
Abstract: In the field of biometrics, evaluation of the quality of biometric samples has a number of important applications, chiefly (1) rejecting poor-quality images during acquisition, (2) serving as an enhancement metric, and (3) acting as a weighting factor in fusion schemes. Since a biometric-based recognition system relies on measures of performance such as matching scores and recognition probability of error, it is intuitive that metrics evaluating biometric sample quality should be linked to the recognition performance of the system. The goal of this work is to design a method for evaluating and ranking various quality metrics applied to biometric images or signals based on their ability to predict the recognition performance of a biometric recognition system. The proposed method involves: (1) a preprocessing algorithm operating on pairs of quality scores and generating relative scores, (2) an adaptive multivariate mapping relating quality scores to measures of recognition performance, and (3) a ranking algorithm that selects the best combinations of quality measures. The performance of the method is demonstrated on face and iris biometric data.
Citations: 9
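A toy version of step (3), ranking candidate quality metrics by their ability to predict recognition scores, can use plain correlation as a crude stand-in for the paper's adaptive multivariate mapping:

```python
import numpy as np

def rank_quality_metrics(metrics, recognition_scores):
    """Rank quality metrics by |Pearson r| against per-sample
    recognition scores. A deliberately simple stand-in for the
    paper's adaptive multivariate mapping; metric names and the
    scalar-score setup are illustrative assumptions."""
    strength = {name: abs(np.corrcoef(q, recognition_scores)[0, 1])
                for name, q in metrics.items()}
    return sorted(strength, key=strength.get, reverse=True)
```

The ranking then tells a system designer which quality measures are worth computing at acquisition time or feeding into a fusion scheme.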
Multiple label prediction for image annotation with multiple Kernel correlation models
Oksana Yakhnenko, Vasant G Honavar
DOI: 10.1109/CVPRW.2009.5204274 · Published 2009-06-20
Abstract: Image annotation is the challenging task of correlating text keywords with an image. In this paper we address the problem of image annotation using a Kernel Multiple Linear Regression model. Multiple Linear Regression (MLR) reconstructs an image caption from an image by performing a linear transformation of the image into a semantic space, and then recovers the caption by performing another linear transformation from the semantic space into the label space. The model is trained so that its parameters directly minimize the reconstruction error. This model is related to Canonical Correlation Analysis (CCA), which maps both images and captions into the semantic space so as to minimize the distance between the mappings in that space. The kernel trick is then applied to MLR, yielding the Kernel Multiple Linear Regression model. The solution to KMLR is a solution to a generalized eigenvalue problem, related to that of KCCA (Kernel Canonical Correlation Analysis). We then extend the Kernel Multiple Linear Regression and Kernel Canonical Correlation Analysis models to the multiple-kernel setting, to allow various representations of images and captions. We present results for image annotation using multiple-kernel-learning CCA and MLR on the Oliva and Torralba (2001) scene recognition data that show kernel selection behaviour.
Citations: 15
On conversion from color to gray-scale images for face detection
Juwei Lu, K. Plataniotis
DOI: 10.1109/CVPRW.2009.5204297 · Published 2009-06-20
Abstract: This paper presents a study of color-to-gray image conversion from a novel point of view: face detection. To the best of the authors' knowledge, research on this specific topic has not been conducted before. Our work reveals that the standard NTSC conversion is not optimal for face detection tasks, although it may be the best choice for displaying pictures on monochrome televisions. It is further found experimentally, with two AdaBoost-based face detection systems, that detection rates may vary by up to 10% simply by changing the parameters of the RGB-to-gray conversion, while the change has little influence on the false positive rates. Compared with the standard NTSC conversion, the detection rate with the best parameter setting found is 2.85% and 3.58% higher for the two evaluated face detection systems, respectively. Promisingly, this work suggests a new approach to color-to-gray conversion that can easily be incorporated into most existing face detection systems to improve accuracy without any extra computational cost.
Citations: 29
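The conversion being tuned is just a weighted sum of the three color channels; a minimal sketch with the standard NTSC luma weights as the default (the paper's optimized weights are not reproduced here):

```python
import numpy as np

def rgb_to_gray(img, weights=(0.299, 0.587, 0.114)):
    """Convert an H x W x 3 RGB image to gray-scale as a weighted
    channel sum. The default triple is the standard NTSC conversion;
    the paper's point is that other weight choices can raise face
    detection rates at no extra computational cost."""
    w = np.asarray(weights, dtype=float)
    return img[..., :3] @ w
```

Since any weight triple costs the same three multiply-adds per pixel, swapping in tuned weights is a drop-in change for an existing detection pipeline.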