{"title":"Object classification with multi-scale autoconvolution","authors":"Esa Rahtu, J. Heikkilä","doi":"10.1109/ICPR.2004.1334463","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1334463","url":null,"abstract":"This paper assesses the recently proposed affine invariant image transform called a multi-scale autoconvolution (MSA) in some practical object classification problems. A classification framework based on the MSA and support vector machines is introduced. As shown by the comparison with another affine invariant technique, it appears that this new technique provides a good basis for problems where the disturbances in classified objects can be approximated with spatial affine transformation. The paper also introduces a new property clarifying the parameter selection in the multi-scale autoconvolution.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125502581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Indexing with musical events and its application to content-based music identification","authors":"Sheng Gao, Chin-Hui Lee, Q. Tian","doi":"10.1109/ICPR.2004.1334660","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1334660","url":null,"abstract":"In this paper a musical event based indexing approach is proposed and its application to content-based music identification is studied. The events, which function as term words used in text retrieval or basic speech units in speech recognition, are inferred using an unsupervised learning algorithm. Its differences with the existing methods are in that the learned low-level musicology knowledge and model selection technique are exploited to extract musical events. Our experimental analyses on a task of music identification demonstrate that the proposed indexing method is efficient, compact and robust. Using a collection of 20-second query segments on the evaluation set, the equal error rate reaches 1.57%. For applications that demand fewer false alarms, we could operate the system at a reduced false acceptance rate of 0.57% while increasing the false rejection rate to 4.58%.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114961315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel pattern recognition computations within a wireless sensor network","authors":"Asad I. Khan, P. Mihailescu","doi":"10.1109/ICPR.2004.1334332","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1334332","url":null,"abstract":"The computational properties of a wireless sensor network (WSN) have been investigated by implementing a fully distributed pattern recognition algorithm within the network. It is shown that the set up allows a physical object to develop a capability, which to some extent may be considered similar to our sense of touch, with the WSN acting as an artificial nervous system in this regard. The effectiveness of the algorithm is inspected by comparing the outputs from the sensors with the stress patterns generated through a simple finite element model and then stored within the network. It is shown that the test object could successfully differentiate between its internal stress states resulting from the changes to its external loading conditions. Suitability of the algorithm is discussed with respect to the data storage requirement per node of the WSN.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"118 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115349584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of model-based interactive flower recognition","authors":"Jie Zou, G. Nagy","doi":"10.1109/ICPR.2004.1334185","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1334185","url":null,"abstract":"We introduce the concept of computer assisted visual interactive recognition (CAVIAR). In CAVIAR, a parameterized geometrical model serves as the human-computer communication channel. We implemented a flower recognition system and evaluated it on 30 inexperienced subjects. Major conclusions include: 1) the accuracy of the CAVIAR system is much higher than that of the machine alone; 2) its recognition time is much lower than that of the human alone; 3) it can be initialized with as few as one training sample per class and still achieve high accuracy; 4) it demonstrates a self-learning ability, which suggests that instead of initializing the CAVIAR system with many training samples, we can trust the system's self-learning ability.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"178 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115990690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Binary image transformation using two-dimensional chaotic maps","authors":"F. Belkhouche, U. Qidwai, I. Gokcen, D. Joachim","doi":"10.1109/ICPR.2004.1333899","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1333899","url":null,"abstract":"We present an algorithm for binary image transformation using chaotic maps. Because of its random-like behavior, chaos is a good candidate for encryption. We show that a two-dimensional discrete time dynamical system with one positive Lyapunov exponent allows the transformation of the image in an unpredictable manner. The suggested algorithm acts on the pixel position, where the diffusion property resulting from the sensitivity to the initial states is used to accomplish the transformation in a random-like way. The suggested algorithm uses three types of keys: initial state, external parameters and the number of iterations. Using the so-called Henon map as an example, we show that the algorithm produces almost uncorrelated images even when the keys are slightly changed, making it an attractive and fast method for image encryption.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116125424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Coplanar light sweep-surface supported uncalibrated photometric stereo","authors":"Hui Kong, E. Teoh, Jian-Gang Wang, R. Venkateswarlu","doi":"10.1109/ICPR.2004.1333713","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1333713","url":null,"abstract":"In Lambertian uncalibrated photometric stereo (UPS), the object surface albedo and normals, the lighting directions and intensities are determined up to an arbitrary invertible matrix. In this paper, a novel method is proposed to reduce such an ambiguity. With the support of a coplanar light sweep-surface (CLSS), some key normals are determined relative to a fiducial normal which is estimated by the CLSS. Thus the arbitrary transformation is reduced to invertible matrix up to a rotation and a scale. The rotation can be solved by controlling the direction of camera, the resulting ambiguity transformation can be finally determined up to a global scale.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122650475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shooting the lecture scene using computer-controlled cameras based on situation understanding and evaluation of video images","authors":"M. Onishi, K. Fukunaga","doi":"10.1109/ICPR.2004.1334333","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1334333","url":null,"abstract":"We propose a computer-controlled camera work that shoots object scenes to model the professional cameramen's work and selects the best image among plural video images as a switcher. We apply this system to a shooting of a lecture scene. In the first image, our system estimates a teacher's action based on the features of a teacher and a blackboard. In the next, each camera is directed to a shooting area based on the teacher's action, automatically. In the last, this system selects the best image among plural images under the evaluation rule. Moreover, we have tried experiments of shooting lecture scene and have confirmed the effectiveness of our approach.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122576065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hybrid watermarking scheme for H.264/AVC video","authors":"G. Qiu, P. Marziliano, A. Ho, D. He, Qibin Sun","doi":"10.1109/ICPR.2004.1333909","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1333909","url":null,"abstract":"A novel H.264/AVC watermarking method is proposed in this paper. By embedding the robust watermark into DCT domain and the fragile watermark into motion vectors respectively, the proposed method can jointly achieve both copyright protection and authentication. Our scheme outperforms other video watermarking schemes on higher watermarking capacity especially in lower compression bit-rates. Furthermore, being well aligned with Lagrangian optimization for mode choice featured in H.264/AVC, the proposed scheme only introduces small distortions into the video content. Experimental results also demonstrate that the proposed solution is very computationally efficient during watermark extraction.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114179358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Bayesian framework for robust human detection and occlusion handling human shape model","authors":"H. Eng, Junxian Wang, A. H. Kam, W. Yau","doi":"10.1109/ICPR.2004.19","DOIUrl":"https://doi.org/10.1109/ICPR.2004.19","url":null,"abstract":"One challenging aspect of automated surveillance for real environments is the occurrences of various difficult scenarios brought about by practical unconstrained settings. We address foreground detection for automated surveillance under the following challenging situations: i) foregrounds being partially hidden due to close similarities to the background, and ii) foregrounds representing multiple objects being inseparable, forming a large contiguous blob due to occlusion. To build a robust system, we present a new foreground detection framework based on Bayesian formulation, comprising both bottom-up and top-down approaches. We first propose a region-based background subtraction and a localized spatial segmentation scheme as the bottom-up steps for foreground detection. We then incorporate a human shape model as the top-down step for foreground validation and occlusion handling. Segmentation is obtained when a maximum posteriori value is found, corresponding to the best description about foregrounds given by the approach. Such integration of bottom-up and top-down approaches leads directly to more robust performance in handling challenging situations within hostile real environments. Promising results are obtained when the algorithm is tested on real video sequences captured from a live surveillance system that operates at a public outdoor swimming pool.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114453949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hybrid SOM-SVM method for analyzing zebra fish gene expression","authors":"Wu Wei, Liu Xin, Xu Min, Peng Jinrong, R. Setiono","doi":"10.1109/ICPR.2004.1334191","DOIUrl":"https://doi.org/10.1109/ICPR.2004.1334191","url":null,"abstract":"Microarray technology can be employed to quantitatively measure the expression of thousands of genes in a single experiment. It has become one of the main tools for global gene expression analysis in molecular biology research in recent years. The large amount of expression data generated by this technology makes the study of certain complex biological problems possible, and machine learning methods are expected to play a crucial role in the analysis process. We present our results from integrating a self-organizing maps (SOM) and a support vector machine (SVM) for the analysis of the various functions of zebra fish genes based on their expression. We discuss how SOM can be used as a data-filtering tool to improve the classification performance of the SVM on this data set.","PeriodicalId":335842,"journal":{"name":"Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114502240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}