{"title":"Automatic Frequency Band Selection for Illumination Robust Face Recognition","authors":"H. K. Ekenel, R. Stiefelhagen","doi":"10.1109/ICPR.2010.658","DOIUrl":"https://doi.org/10.1109/ICPR.2010.658","url":null,"abstract":"Varying illumination conditions cause a dramatic change in facial appearance that leads to a significant drop in face recognition algorithms' performance. In this paper, to overcome this problem, we utilize an automatic frequency band selection scheme. The proposed approach is incorporated into a local appearance-based face recognition algorithm, which employs the discrete cosine transform (DCT) for processing local facial regions. From the extracted DCT coefficients, the approach determines the ones that should be used for classification. Extensive experiments conducted on the extended Yale face database B have shown that benefiting from frequency information provides robust face recognition under changing illumination conditions.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130385924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visibility of Multiple Cameras in a Scene with Unknown Geometry","authors":"Liuxin Zhang, Yunde Jia","doi":"10.1109/ICPR.2010.883","DOIUrl":"https://doi.org/10.1109/ICPR.2010.883","url":null,"abstract":"In this paper, we investigate the problem of determining the visible regions of multiple cameras in a 3D scene without a priori knowledge of the scene geometry. Our approach is based on a variational energy functional where both the unresolved visibility information of multiple cameras and the unknown scene geometry are included. We cast visibility estimation and scene geometry reconstruction as an optimization of the variational energy functional amenable for minimization with the Euler-Lagrange driven evolution. Starting from any initial value, the accurate visibility of multiple cameras as well as the true scene geometry can be obtained at the end of the evolution. Experimental results show the validity of our approach.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132997441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Scene Semantics Using Fiedler Embedding","authors":"Jingen Liu, Saad Ali","doi":"10.1109/ICPR.2010.885","DOIUrl":"https://doi.org/10.1109/ICPR.2010.885","url":null,"abstract":"We propose a framework to learn scene semantics from surveillance videos. Using the learnt scene semantics, a video analyst can efficiently and effectively retrieve the hidden semantic relationship between homogeneous and heterogeneous entities existing in the surveillance system. For learning scene semantics, the algorithm treats different entities as nodes in a graph, where weighted edges between the nodes represent the \"initial\" strength of the relationship between entities. The graph is then embedded into a k-dimensional space by Fiedler Embedding.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"624 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132857769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TrajAlign: A Method for Precise Matching of 3-D Trajectories","authors":"Z. Aung, Kelvin Sim, W. Ng","doi":"10.1109/ICPR.2010.930","DOIUrl":"https://doi.org/10.1109/ICPR.2010.930","url":null,"abstract":"Matching two 3-D trajectories is an important task in a number of applications. The trajectory matching problem can be solved by aligning the two trajectories and taking the alignment score as their similarity measurement. In this paper, we propose a new method called \"TrajAlign\" (Trajectory Alignment). It aligns two trajectories by means of aligning their representative distance matrices. Experimental results show that our method is significantly more precise than the existing state-of-the-art methods. While the existing methods can provide correct answers in only up to 67% of the test cases, TrajAlign can offer correct results in 79% (i.e. 12% more) of the test cases. TrajAlign is also computationally inexpensive, and can be used practically for applications that demand efficiency.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121926477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Action Detection in Crowded Videos Using Masks","authors":"Ping Guo, Z. Miao","doi":"10.1109/ICPR.2010.436","DOIUrl":"https://doi.org/10.1109/ICPR.2010.436","url":null,"abstract":"In this paper, we investigate the task of human action detection in crowded videos. Unlike action analysis in clean scenes, action detection in crowded environments is difficult due to cluttered backgrounds, high densities of people and partial occlusions. This paper proposes a method for action detection based on masks. No human segmentation or tracking technique is required. To cope with cluttered and crowded backgrounds, shape and motion templates are built, and the shape templates are used as masks for feature refinement. In order to handle the partial occlusion problem, only the moving body parts in each motion are involved in action training. Experiments using our approach are conducted on the CMU dataset with encouraging results.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"182 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125827521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The PAGE (Page Analysis and Ground-Truth Elements) Format Framework","authors":"S. Pletschacher, A. Antonacopoulos","doi":"10.1109/ICPR.2010.72","DOIUrl":"https://doi.org/10.1109/ICPR.2010.72","url":null,"abstract":"There is a plethora of established and proposed document representation formats but none that can adequately support individual stages within an entire sequence of document image analysis methods (from document image enhancement to layout analysis to OCR) and their evaluation. This paper describes PAGE, a new XML-based page image representation framework that records information on image characteristics (image borders, geometric distortions and corresponding corrections, binarisation etc.) in addition to layout structure and page content. The suitability of the framework to the evaluation of entire workflows as well as individual stages has been extensively validated by using it in high-profile applications such as in public contemporary and historical ground-truthed datasets and in the ICDAR Page Segmentation competition series.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114968123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Levelings and Flat Zone Morphology","authors":"F. Meyer","doi":"10.1109/ICPR.2010.388","DOIUrl":"https://doi.org/10.1109/ICPR.2010.388","url":null,"abstract":"Successive levelings are applied on document images. The residues of successive levelings are made of flat zones for which morphological transforms are described.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"450 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123055579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating a Discrete Motion Model into GMM Based Background Subtraction","authors":"Christian Wolf, J. Jolion","doi":"10.1109/ICPR.2010.11","DOIUrl":"https://doi.org/10.1109/ICPR.2010.11","url":null,"abstract":"GMM based algorithms have become the de facto standard for background subtraction in video sequences, mainly because of their ability to track multiple background distributions, which allows them to handle complex scenes including moving trees, flags moving in the wind etc. However, it is not always easy to determine which distributions of the mixture belong to the background and which belong to the foreground, which disturbs the results of the labeling process for each pixel. In this work we tackle this problem by taking the labeling decision jointly for all pixels of several consecutive frames, minimizing a global energy function that takes into account spatial and temporal relationships. A discrete approximate optical-flow-like motion model is integrated into the energy function, which is solved with Ishikawa's convex graph cuts algorithm.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"116 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123062826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Use of Line Spectral Frequencies for Emotion Recognition from Speech","authors":"E. Bozkurt, E. Erzin, Ç. Erdem, A. Erdem","doi":"10.1109/ICPR.2010.903","DOIUrl":"https://doi.org/10.1109/ICPR.2010.903","url":null,"abstract":"We propose the use of line spectral frequency (LSF) features for emotion recognition from speech, which, to the best of our knowledge, have not been previously employed for emotion recognition. Spectral features such as mel-scaled cepstral coefficients have already been successfully used for the parameterization of speech signals for emotion recognition. The LSF features also offer a spectral representation for speech; moreover, they carry intrinsic information on the formant structure as well, which is related to the emotional state of the speaker [4]. We use the Gaussian mixture model (GMM) classifier architecture, which captures the static color of the spectral features. Experimental studies performed over the Berlin Emotional Speech Database and the FAU Aibo Emotion Corpus demonstrate that decision fusion configurations with LSF features bring a consistent improvement over the MFCC based emotion classification rates.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"132 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124432025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Encoding Actions via Quantized Vocabulary of Averaged Silhouettes","authors":"Liang Wang, C. Leckie","doi":"10.1109/ICPR.2010.892","DOIUrl":"https://doi.org/10.1109/ICPR.2010.892","url":null,"abstract":"Human action recognition from video clips has received increasing attention in recent years. This paper proposes a simple yet effective method for the problem of action recognition. The method aims to encode human actions using the quantized vocabulary of averaged silhouettes that are derived from space-time windowed shapes and implicitly capture local temporal motion as well as global body shape. Experimental results on the publicly available Weizmann dataset have demonstrated that, despite its simplicity, our method is effective for recognizing actions, and is comparable to other state-of-the-art methods.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"252 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121814991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}