{"title":"Theme-Based Multi-class Object Recognition and Segmentation","authors":"Shilin Wu, Jiajia Geng, F. Zhu","doi":"10.1109/ICPR.2010.738","DOIUrl":"https://doi.org/10.1109/ICPR.2010.738","url":null,"abstract":"In this paper, we propose a new theme-based CRF model and investigate its performance on class based pixel-wise segmentation of images. By including the theme of an image, we also propose a new texture-environment potential to represent texture environment of a pixel, which alone gives satisfactory recognition results. The pixel-wise segmentation accuracy is remarkably improved by introducing texture potential. We compare our results to recent published results on the MSRC 21-class database and show that our theme-based CRF model significantly outperforms the current state-of-the-art. Especially, by assigning a theme for each image, our model obtains greatly improved accuracy of structured classes with high visual variability and fewer training examples, the accuracy of which is very low in most related works.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114265987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adding Classes Online in Error Correcting Output Codes Framework","authors":"Sergio Escalera, D. Masip, Eloi Puertas, P. Radeva, O. Pujol","doi":"10.1109/ICPR.2010.722","DOIUrl":"https://doi.org/10.1109/ICPR.2010.722","url":null,"abstract":"This article proposes a general extension of the Error Correcting Output Codes (ECOC) framework to the online learning scenario. As a result, the final classifier handles the addition of new classes independently of the base classifier used. Validation on UCI database and two real machine vision applications show that the online problem-dependent ECOC proposal provides a feasible and robust way for handling new classes using any base classifier.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114468436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Restoration of Scratch in Old Archive","authors":"Kyung-tai Kim, Byunggeun Kim, Eun Yi Kim","doi":"10.1109/ICPR.2010.122","DOIUrl":"https://doi.org/10.1109/ICPR.2010.122","url":null,"abstract":"This paper presents a scratch restoration method that can deal with scratches of various lengths and widths in old film. The proposed method consists of detection and reconstruction. The detection is performed using texture and shape properties of the scratches: first, each pixel is classified as scratch or non-scratch using a neural network (NN)-based texture classifier, and then some false alarms are removed by shape filtering. Thereafter, the detected region is reconstructed. Here, the reconstruction is formulated as an energy minimization problem, and a genetic algorithm is used as the optimization algorithm. The experimental results with well-known old films showed the effectiveness of the proposed method.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121998175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast Logo Detection and Recognition in Document Images","authors":"Zhe Li, Matthias Schulte-Austum, M. Neschen","doi":"10.1109/ICPR.2010.665","DOIUrl":"https://doi.org/10.1109/ICPR.2010.665","url":null,"abstract":"The scientific significance of automatic logo detection and recognition is growing steadily because of the increasing requirements of intelligent document image analysis and retrieval. In this paper, we introduce a system architecture that aims at segmentation-free and layout-independent logo detection and recognition. Along with the unique logo feature design, a novel way to ensure the geometrical relationships among the features, and different optimizations in the recognition process, this system can achieve improvements concerning both the recognition performance and the running time. The experimental results on several sets of real-world documents demonstrate the effectiveness of our approach.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122023047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"De-noising of SRµCT Fiber Images by Total Variation Minimization","authors":"Joakim Lindblad, Natasa Sladoje, T. Lukić","doi":"10.1109/ICPR.2010.1116","DOIUrl":"https://doi.org/10.1109/ICPR.2010.1116","url":null,"abstract":"SRµCT images of paper and pulp fiber materials are characterized by a low signal to noise ratio. De-noising is therefore a common preprocessing step before segmentation into fiber and background components. We suggest a de-noising method based on total variation minimization using a modified Spectral Conjugate Gradient algorithm. Quantitative evaluation performed on synthetic 3D data and qualitative evaluation on real 3D paper fiber data confirm appropriateness of the suggested method for the particular application.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122059507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Median Graph Shift: A New Clustering Algorithm for Graph Domain","authors":"Salim Jouili, S. Tabbone, V. Lacroix","doi":"10.1109/ICPR.2010.238","DOIUrl":"https://doi.org/10.1109/ICPR.2010.238","url":null,"abstract":"In the context of unsupervised clustering, a new algorithm for the domain of graphs is introduced. In this paper, the key idea is to adapt the mean-shift clustering and its variants proposed for the domain of feature vectors to graph clustering. These algorithms have been applied successfully in image analysis and computer vision domains. The proposed algorithm works in an iterative manner by shifting each graph towards the median graph in a neighborhood. Both the set median graph and the generalized median graph are tested for the shifting procedure. In the experiment part, a set of cluster validation indices are used to evaluate our clustering algorithm and a comparison with the well-known Kmeans algorithm is provided.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116752594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Boosted Edge Orientation Histograms for Grasping Point Detection","authors":"L. Lefakis, H. Wildenauer, Manuel Pascual Garcia-Tubio, L. Szumilas","doi":"10.1109/ICPR.2010.990","DOIUrl":"https://doi.org/10.1109/ICPR.2010.990","url":null,"abstract":"In this paper, we describe a novel algorithm for the detection of grasping points in images of previously unseen objects. A basic building block of our approach is the use of a newly devised descriptor, representing semi-local grasping point shape by the use of edge orientation histograms. Combined with boosting, our method learns discriminative grasp point models for new objects from a set of annotated real-world images. The method has been extensively evaluated on challenging images of real scenes, exhibiting largely varying characteristics concerning illumination conditions, scene complexity, and viewpoint. Our experiments show that the method works in a stable manner and that its performance compares favorably to the state-of-the-art.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117026074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal Human Computer Interaction with MIDAS Intelligent Infokiosk","authors":"Alexey Karpov, A. Ronzhin, I. Kipyatkova, A. Ronzhin, L. Akarun","doi":"10.1109/ICPR.2010.941","DOIUrl":"https://doi.org/10.1109/ICPR.2010.941","url":null,"abstract":"In this paper, we present an intelligent information kiosk called MIDAS (Multimodal Interactive-Dialogue Automaton for Self-service), including its hardware and software architecture, stages of deployment of speech recognition and synthesis technologies. MIDAS uses the methodology Wizard of Oz (WOZ) that allows an expert to correct speech recognition results and control the dialogue flow. User statistics of the multimodal human computer interaction (HCI) have been analyzed for the operation of the kiosk in the automatic and automated modes. The infokiosk offers information about the structure and staff of laboratories, the location and phones of departments and employees of the institution. The multimodal user interface is provided with a touch screen, natural speech input and head and manual gestures, both for ordinary and physically handicapped users.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117098268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Person-Specific Face Shape Estimation under Varying Head Pose from Single Snapshots","authors":"F. Dornaika, B. Raducanu","doi":"10.1109/ICPR.2010.853","DOIUrl":"https://doi.org/10.1109/ICPR.2010.853","url":null,"abstract":"This paper presents a new method for person-specific face shape estimation under varying head pose of a previously unseen person from a single image. We describe a featureless approach based on a deformable 3D model and a learned face subspace. The proposed approach is based on maximizing a likelihood measure associated with a learned face subspace, which is carried out by a stochastic and genetic optimizer. We conducted the experiments on a subset of Honda Video Database showing the feasibility and robustness of the proposed approach. For this reason, our approach could lend itself nicely to complex frameworks involving 3D face tracking and face gesture recognition in monocular videos.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129551301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applying Error-Correcting Output Coding to Enhance Convolutional Neural Network for Target Detection and Pattern Recognition","authors":"Huiqun Deng, G. Stathopoulos, C. Suen","doi":"10.1109/ICPR.2010.1043","DOIUrl":"https://doi.org/10.1109/ICPR.2010.1043","url":null,"abstract":"This paper views target detection and pattern recognition as a kind of communications problem and applies error-correcting coding to the outputs of a convolutional neural network to improve the accuracy and reliability of detection and recognition of targets. The outputs of the convolutional neural network are designed according to codewords with maximum Hamming distances. The effects of the codewords on the performance of the convolutional neural network in target detection and recognition are then investigated. Images of hand-written digits and printed English letters and symbols are used in the experiments. Results show that error-correcting output coding provides the neural network with more reliable decision rules and enables it to perform more accurate and reliable detection and recognition of targets. Moreover, our error-correcting output coding can reduce the number of neurons required, which is highly desirable in efficient implementations.","PeriodicalId":309591,"journal":{"name":"2010 20th International Conference on Pattern Recognition","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129565713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}