Title: Audio-visual speech synchronization detection using a bimodal linear prediction model
Authors: Kshitiz Kumar, Jirí Navrátil, E. Marcheret, V. Libal, G. Ramaswamy, G. Potamianos
Venue: 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, June 20, 2009
DOI: https://doi.org/10.1109/CVPRW.2009.5204303
Abstract: In this work, we study the problem of detecting audio-visual (AV) synchronization in video segments containing a speaker in frontal head pose. The problem holds important applications in biometrics, for example spoofing detection, and it constitutes an important step in AV segmentation necessary for deriving AV fingerprints in multimodal speaker recognition. To attack the problem, we propose a time-evolution model for AV features and derive an analytical approach to capture the notion of synchronization between them. We report results on an appropriate AV database, using two types of visual features extracted from the speaker's facial area: geometric ones and features based on the discrete cosine image transform. Our results demonstrate that the proposed approach provides substantially better AV synchrony detection over a baseline method that employs mutual information, with the geometric visual features outperforming the image transform ones.
{"title":"Color calibration of multi-projector displays through automatic optimization of hardware settings","authors":"R. M. Steele, Mao Ye, Ruigang Yang","doi":"10.1109/CVPRW.2009.5204322","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204322","url":null,"abstract":"We describe a system that performs automatic, camera-based photometric projector calibration by adjusting hardware settings (e.g. brightness, contrast, etc.). The approach has two basic advantages over software-correction methods. First, there is no software interface imposed on graphical programs: all imagery displayed on the projector benefits from the calibration immediately, without render-time overhead or code changes. Secondly, the approach benefits from the fact that projector hardware settings typically are capable of expanding or shifting color gamuts (e.g. trading off maximum brightness versus darkness of black levels), something that software methods, which only shrink gamuts, cannot do. In practice this means that hardware settings can possibly match colors between projectors while maintaining a larger overall color gamut (e.g. better contrast) than software-only correction can. The prototype system is fully automatic. The space of hardware settings is explored by using a computer-controlled universal remote to navigate each projector's menu system. An off-the-shelf camera observes each projector's response curves. A cost function is computed for the curves based on their similarity to each other, as well as intrinsic characteristics, including color balance, black level, gamma, and dynamic range. An approximate optimum is found using a heuristic combinatoric search. Results show significant qualitative improvements in the absolute colors, as well as the color consistency, of the display.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126388389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning to segment using machine-learned penalized logistic models","authors":"Yong Yue, H. Tagare","doi":"10.1109/CVPRW.2009.5204343","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204343","url":null,"abstract":"Classical maximum-a-posteriori (MAP) segmentation uses generative models for images. However, creating tractable generative models can be difficult for complex images. Moreover, generative models require auxiliary parameters to be included in the maximization, which makes the maximization more complicated. This paper proposes an alternative to the MAP approach: using a penalized logistic model to directly model the segmentation posterior. This approach has two advantages: (1) It requires fewer auxiliary parameters, and (2) it provides a standard way of incorporating powerful machine-learning methods into segmentation so that complex image phenomenon can be learned easily from a training set. The technique is used to segment cardiac ultrasound images sequences which have substantial spatio-temporal contrast variation that is cumbersome to model. Experimental results show that the method gives accurate segmentations of the endocardium in spite of the contrast variation.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"74 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114099009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Biometric data hiding: A 3 factor authentication approach to verify identity with a single image using steganography, encryption and matching","authors":"Neha Agrawal, M. Savvides","doi":"10.1109/CVPRW.2009.5204308","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204308","url":null,"abstract":"Digital Steganography exploits the use of host data to hide a piece of information in such a way that it is imperceptible to a human observer. Its main objectives are imperceptibility, robustness and high payload. DCT Domain message embedding in Spread Spectrum Steganography describes a novel method of using redundancy in DCT coefficients. We improved upon the method of DCT embedding by using the sign of the DCT coefficients to get better accuracy of retrieved data and more robustness under channel attacks like channel noise and JPEG compression artifacts while maintaining the visual imperceptibility of cover image, and even extending the method further to obtain higher payloads. We also apply this method for secure biometric data hiding, transmission and recovery. We hide iris code templates and fingerprints in the host image which can be any arbitrary image, such as face biometric modality and transmit the so formed imperceptible Stego-Image securely and robustly for authentication, and yet obtain perfect reconstruction and classification of iris codes and retrieval of fingerprints at the receiving end without any knowledge of the cover image i.e. a blind method of steganography, which in this case is used to hide biometric template in another biometric modality.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122405120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Square Loss based regularized LDA for face recognition using image sets","authors":"Yanlin Geng, Caifeng Shan, Pengwei Hao","doi":"10.1109/CVPRW.2009.5204307","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204307","url":null,"abstract":"In this paper, we focus on face recognition over image sets, where each set is represented by a linear subspace. Linear Discriminant Analysis (LDA) is adopted for discriminative learning. After investigating the relation between regularization on Fisher Criterion and Maximum Margin Criterion, we present a unified framework for regularized LDA. With the framework, the ratio-form maximization of regularized Fisher LDA can be reduced to the difference-form optimization with an additional constraint. By incorporating the empirical loss as the regularization term, we introduce a generalized Square Loss based Regularized LDA (SLR-LDA) with suggestion on parameter setting. Our approach achieves superior performance to the state-of-the-art methods on face recognition. Its effectiveness is also evidently verified in general object and object category recognition experiments.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"287 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131425184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An alignment based similarity measure for hand detection in cluttered sign language video","authors":"Ashwin Thangali, S. Sclaroff","doi":"10.1109/CVPRW.2009.5204266","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204266","url":null,"abstract":"Locating hands in sign language video is challenging due to a number of factors. Hand appearance varies widely across signers due to anthropometric variations and varying levels of signer proficiency. Video can be captured under varying illumination, camera resolutions, and levels of scene clutter, e.g., high-res video captured in a studio vs. low-res video gathered by a Web cam in a user's home. Moreover, the signers' clothing varies, e.g., skin-toned clothing vs. contrasting clothing, short-sleeved vs. long-sleeved shirts, etc. In this work, the hand detection problem is addressed in an appearance matching framework. The histogram of oriented gradient (HOG) based matching score function is reformulated to allow non-rigid alignment between pairs of images to account for hand shape variation. The resulting alignment score is used within a support vector machine hand/not-hand classifier for hand detection. The new matching score function yields improved performance (in ROC area and hand detection rate) over the vocabulary guided pyramid match kernel (VGPMK) and the traditional, rigid HOG distance on American Sign Language video gestured by expert signers. The proposed match score function is computationally less expensive (for training and testing), has fewer parameters and is less sensitive to parameter settings than VGPMK. The proposed detector works well on test sequences from an inexpert signer in a non-studio setting with cluttered background.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126974158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Dual domain auxiliary particle filter with integrated target signature update
Authors: C. M. Johnston, N.A. Mould, J. Havlicek, Guoliang Fan
Venue: 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, June 20, 2009
DOI: https://doi.org/10.1109/CVPRW.2009.5204143
Abstract: For the first time, we formulate an auxiliary particle filter jointly in the pixel domain and modulation domain for tracking infrared targets. This dual domain approach provides an information-rich image representation comprising the pixel domain frames acquired directly from an imaging infrared sensor as well as 18 amplitude modulation functions obtained through a multicomponent AM-FM image analysis. The new dual domain auxiliary particle filter successfully tracks all of the difficult targets in the well-known AMCOM closure sequences in terms of both centroid location and target magnification. In addition, we incorporate the template update procedure into the particle filter formulation to extend the previously studied dual domain track consistency checking mechanism far beyond the normalized cross correlation (NCC) trackers of the past by explicitly quantifying the differences in target signature evolution between the modulation and pixel domains. Experimental results indicate that the dual domain auxiliary particle filter with integrated target signature update provides a significant performance advantage relative to several recent competing algorithms.
{"title":"Fast features for time constrained object detection","authors":"G. Overett, L. Petersson","doi":"10.1109/CVPRW.2009.5204293","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204293","url":null,"abstract":"This paper concerns itself with the development and design of fast features suitable for time constrained object detection. Primarily we consider three aspects of feature design; the form of the precomputed datatype (e.g. the integral image), the form of the features themselves (i.e. the measurements made of an image), and the models/weak- learners used to construct weak classifiers (class, non-class statistics). The paper is laid out as a guide to feature designers, demonstrating how appropriate choices in combining the above three characteristics can prevent bottlenecks in the run-time evaluation of classifiers. This leads to reductions in the computational time of the features themselves and, by providing more discriminant features, reductions in the time taken to reach specific classification error rates. Results are compared using variants of the well known Haar-like feature types, Rectangular Histogram of Oriented Gradient (RHOG) features and a special set of Histogram of Oriented Gradient features which are highly optimized for speed. Experimental results suggest the adoption of this set of features for time-critical applications. Time-constrained comparisons are presented using pedestrian and road sign detection problems. Comparison results are presented on time-error plots, which are a replacement of the traditional ROC performance curves.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134033434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Multi-modal laughter recognition in video conversations
Authors: Sergio Escalera, Eloi Puertas, P. Radeva, O. Pujol
Venue: 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, June 20, 2009
DOI: https://doi.org/10.1109/CVPRW.2009.5204268
Abstract: Laughter detection is an important area of interest in the Affective Computing and Human-Computer Interaction fields. In this paper, we propose a multi-modal methodology based on the fusion of audio and visual cues to deal with the laughter recognition problem in face-to-face conversations. The audio features are extracted from the spectrogram, and the video features are obtained by estimating the degree of mouth movement and using a smile and laughter classifier. Finally, the multi-modal cues are combined in a sequential classifier. Results on videos from the public discussion blog of the New York Times show that both types of features perform better when considered together by the classifier. Moreover, the sequential methodology is shown to significantly outperform the results obtained by an AdaBoost classifier.
Title: Regularization of diffusion tensor field using coupled robust anisotropic diffusion filters
Authors: Songyuan Tang, Yong Fan, Hongtu Zhu, P. Yap, Wei Gao, Weili Lin, D. Shen
Venue: 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, June 20, 2009
DOI: https://doi.org/10.1109/CVPRW.2009.5204342
Abstract: This paper presents a method to simultaneously regularize diffusion weighted images and their estimated diffusion tensors, with the goal of suppressing noise and restoring tensor information. We enforce a data fidelity constraint, using coupled robust anisotropic diffusion filters, to ensure consistency of the restored diffusion tensors with the regularized diffusion weighted images. The filters are designed to take advantage of robust statistics and to be adapted to the anisotropic nature of diffusion tensors, which effectively preserves boundaries between piecewise constant regions in the tensor volume and in the diffusion weighted images during regularization. To facilitate Euclidean operations on the diffusion tensors, log-Euclidean metrics are adopted when performing the filtering. Experimental results on simulated and real image data demonstrate the effectiveness of the proposed method.