{"title":"Computational analysis of mannerism gestures","authors":"K. Kahol, P. Tripathi, S. Panchanathan","doi":"10.1109/ICPR.2004.1334685","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1334685","url":null,"abstract":"Humans perform various gestures in everyday life. While some of these gestures are typically well understood amongst a community (such as \"hello\" and \"goodbye\"), many gestures and movements are typical of an individual's style, body language or mannerisms. Examples of such gestures include the manner in which a person laughs, hand gestures used to converse or the manner in which a person performs a dance sequence. Individuals possess a large vocabulary of mannerism gestures. Conventional modeling of gestures as a series of poses for the purpose of automatically recognizing gestures is inadequate for modeling mannerism gestures. In this paper we propose a novel method to model mannerism gestures. Gestures are modeled as a sequence of events that take place within the segments and the joints of the human body. Each gesture is then represented in an event-driven coupled hidden Markov model (HMM) as a sequence of events, occurring in the various segments and joints. The inherent advantage of using an event-driven coupled-HMM (instead of a pose-driven HMM) is that there is no need to add states to represent more complex gestures or increase the states for the addition of another individual. When this model was tested on a library of 185 gestures, created by 7 subjects, the algorithm achieved an average recognition accuracy of 90.2%.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121636359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generating omnifocus images using graph cuts and a new focus measure","authors":"N. Xu, K. Tan, H. Arora, N. Ahuja","doi":"10.1109/ICPR.2004.1333868","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1333868","url":null,"abstract":"We discuss how to generate omnifocus images from a sequence of different focal setting images. We first show that the existing focus measures would encounter difficulty when detecting which frame is most focused for pixels in the regions between intensity edges and uniform areas. Then we propose a new focus measure that could be used to handle this problem. In addition, after computing focus measures for every pixel in all images, we construct a three dimensional (3D) node-capacitated graph and apply a graph cut based optimization method to estimate a spatio-focus surface that minimizes the summation of the new focus measure values on this surface. An omnifocus image can be directly generated from this minimal spatio-focus surface. Experimental results with simulated and real scenes are provided.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126457230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real time facial expression recognition with AdaBoost","authors":"Yubo Wang, H. Ai, Bo Wu, Chang Huang","doi":"10.1109/ICPR.2004.1334680","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1334680","url":null,"abstract":"In this paper, we propose a novel method for facial expression recognition. The facial expression is extracted from human faces by an expression classifier that is learned from boosting Haar feature based look-up-table type weak classifiers. The expression recognition system consists of three modules, face detection, facial feature landmark extraction and facial expression recognition. The implemented system can automatically recognize seven expressions in real time that include anger, disgust, fear, happiness, neutral, sadness and surprise. Experimental results are reported to show its potential applications in human computer interaction.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"516 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133305453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cancelable biometric filters for face recognition","authors":"M. Savvides, B. Kumar, P. Khosla","doi":"10.1109/ICPR.2004.1334679","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1334679","url":null,"abstract":"In this paper, we address the issue of producing cancelable biometric templates; a necessary feature in the deployment of any biometric authentication system. We propose a novel scheme that encrypts the training images used to synthesize the single minimum average correlation energy filter for biometric authentication. We show theoretically that convolving the training images with any random convolution kernel prior to building the biometric filter does not change the resulting correlation output peak-to-sidelobe ratios, thus preserving the authentication performance. However, different templates can be obtained from the same biometric by varying the convolution kernels thus enabling the cancelability of the templates. We evaluate the proposed method using the illumination subset of the CMU pose, illumination, and expressions (PIE) face dataset. Our proposed method is very interesting from a pattern recognition theory point of view, as we are able to 'encrypt' the data and perform recognition in the encrypted domain that performs as well as the unencrypted case, regardless of the encryption kernel used; we show analytically that the recognition performance remains invariant to the proposed encryption scheme, while retaining the desired shift-invariance property of correlation filters.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"197 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123014560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WillHunter: interactive image retrieval with multilevel relevance","authors":"Hong Wu, Hanqing Lu, Songde Ma","doi":"10.1109/ICPR.2004.1334430","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1334430","url":null,"abstract":"Relevance feedback has become a key component in CBIR systems. Although most current relevance feedback approaches are based on dichotomous relevance measurement, this coarse measurement is a distortion of the reality. We study relevance feedback with multi-level relevance measurement to better identify the user needs and preferences. To validate the use of multi-level relevance measurement and our relevance feedback algorithm, we developed a CBIR prototype system - WillHunter. There are two novelties in our system: one is our fast SVM-based learning algorithm; the other is the easy-to-use graphical user interface, especially the relevance-measuring instrument. Experiments are conducted to assess the algorithm, and a usability study is carried out to evaluate the user interface.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114595674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian face recognition using a Markov chain Monte Carlo method","authors":"A. Matsui, S. Clippingdale, Fumiki Uzawa, Takashi Matsumoto","doi":"10.1109/ICPR.2004.1334678","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1334678","url":null,"abstract":"A new algorithm is proposed for face recognition within a Bayesian framework. Posterior distributions are computed by Markov chain Monte Carlo (MCMC). Face features used in the paper are those used in our previous work based on the elastic graph matching method. While our previous method attempts to optimize facial feature point positions so as to maximize a similarity function between each model and the face region in the input sequence, the proposed approach evaluates posterior distributions of models conditioned on the input sequence. Experimental results show a rather dramatic improvement in robustness. The proposed algorithm eliminates almost all identification errors on sequences showing individuals talking, and reduces identification errors by more than 90% on sequences showing individuals smiling, although such data was not used in training.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128955022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Who are you? [face recognition]","authors":"M. C. Santana, E. Grosso, O. Déniz-Suárez","doi":"10.1109/ICPR.2004.1334683","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1334683","url":null,"abstract":"Most automatic recognition systems are focused on recognizing, given a single mug-shot of an individual, any new image of that individual. Most verification systems are designed to authenticate an identity provided by the user. However, previous work rarely focuses on the problem of detecting when a new individual, i.e., an unknown one, is present. The work presented in this paper aims to provide the system with basic tools to detect when a new individual starts an interactive session, in order to allow the system to add or improve an identity model in the database. Experiments carried out with a set of 36 different individuals show promising results.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122916917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Region based /spl alpha/-semantics graph driven image retrieval","authors":"Ruofei Zhang, Sandeep Khanzode, Zhongfei Zhang","doi":"10.1109/ICPR.2004.1333920","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1333920","url":null,"abstract":"This work is about content based image database retrieval, focusing on developing a classification based methodology to address semantics-intensive image retrieval. With self organization map based image feature grouping, a visual dictionary is created for color, texture, and shape feature attributes, respectively. Labeling each training image with the keywords in the visual dictionary, a classification tree is built. Based on the statistical properties of the feature space we define a structure, called /spl alpha/-semantics graph, to discover the hidden semantic relationships among the semantic repositories embodied in the image database. With the /spl alpha/-semantics graph, each semantic repository is modeled as a unique fuzzy set to explicitly address the semantic uncertainty and the semantic overlap existing among the repositories in the feature space. A retrieval algorithm combining the built classification tree with the developed fuzzy set models to deliver semantically relevant image retrieval is provided. The experimental evaluations have demonstrated that the proposed approach models the semantic relationships effectively and outperforms a state-of-the-art content based image retrieval system in the literature both in effectiveness and efficiency.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130786466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient multimodal features for automatic soccer highlight generation","authors":"K. Wan, Changsheng Xu","doi":"10.1109/ICPR.2004.1334691","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1334691","url":null,"abstract":"We describe efficient audio/visual features and their multimodal combination to detect highlights in soccer video. A novel audio feature first detects dominant speech portions in the commentary coincident with segments of high excitement in the game. Verification is then performed in the visual domain by detecting the presence of goal-mouth in the current shot and a high frequency of camera shot change in the subsequent shots. The cascaded process filters spurious candidate highlights from the noisy audio. The impressive results obtained on a large video test-set belie the technical simplicity in the system, which may now enable rapid generation of highlights on low-cost devices such as household set-top-boxes.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"268 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133824969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gesture tracking and recognition for lecture video editing","authors":"Feng Wang, C. Ngo, T. Pong","doi":"10.1109/ICPR.2004.1334682","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1334682","url":null,"abstract":"This paper presents a gesture-driven approach to video editing. Given a lecture video, we adopt novel approaches to automatically detect and synchronize its content with electronic slides. The gestures in each synchronized topic (or shot) are then tracked and recognized continuously. By registering shots and slides and recovering their transformation, the regions where the gestures take place can be determined. Based on the recognized gestures and their registered positions, the information in slides can be seamlessly extracted, not only to assist video editing, but also to enhance the quality of the original lecture video.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"16 11","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133140217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}