{"title":"Color face recognition: A multilinear-PCA approach combined with Hidden Markov Models","authors":"D. Alexiadis, Dimitrios P. Glaroudis","doi":"10.5220/0003445501130119","DOIUrl":"https://doi.org/10.5220/0003445501130119","url":null,"abstract":"Hidden Markov Models (HMMs) have been successfully applied to the face recognition problem. However, existing HMM-based techniques use feature (observation) vectors that are extracted only from the images' luminance component, while it is known that color provides significant information. In contrast to the classical PCA approach, Multilinear PCA (MPCA) seems to be an appropriate scheme for dimensionality reduction and feature extraction from color images, handling the color channels in a natural, “holistic” manner. In this paper, we propose an MPCA-based approach for color face recognition, that exploits the strengths of HMMs as classifiers. The proposed methodology was tested on three publicly available color databases and produced high recognition rates, compared to existing HMM-based methodologies.","PeriodicalId":103791,"journal":{"name":"Proceedings of the International Conference on Signal Processing and Multimedia Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129275023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image denoising based on Laplace distribution with local parameters in Lapped Transform domain","authors":"V. K. Nath, A. Mahanta","doi":"10.5220/0003516900670072","DOIUrl":"https://doi.org/10.5220/0003516900670072","url":null,"abstract":"In this paper, we present a new image denoising method based on statistical modeling of Lapped Transform (LT) coefficients. The lapped transform coefficients are first rearranged into a wavelet-like structure, then the rearranged coefficient subband statistics are modeled in the same way as wavelet coefficients. We propose to model the rearranged LT coefficients in a subband using a Laplace probability density function (pdf) with local variance. This simple distribution models both the locality and the heavy-tailed property of lapped transform coefficients well. A maximum a posteriori (MAP) estimator using the Laplace pdf with local variance is used to estimate the noise-free lapped transform coefficients. Experimental results show that the proposed low-complexity image denoising method outperforms several wavelet-based image denoising techniques and also outperforms two existing LT-based image denoising schemes. Our main contribution in this paper is to use the local Laplace prior for statistical modeling of LT coefficients and to use a MAP estimation procedure with this proposed prior to restore the noisy image LT coefficients.","PeriodicalId":103791,"journal":{"name":"Proceedings of the International Conference on Signal Processing and Multimedia Applications","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126532421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Context based watermarking of secure JPEG-LS images","authors":"A. Subramanyam, S. Emmanuel","doi":"10.5220/0003446201610166","DOIUrl":"https://doi.org/10.5220/0003446201610166","url":null,"abstract":"JPEG-LS is generally used to compress bio-medical or high dynamic range images. These compressed images sometimes need to be encrypted for confidentiality. In addition, the secured JPEG-LS compressed images may need to be watermarked to detect copyright violation, track different users handling the image, prove ownership or for authentication purposes. In the proposed technique, the watermark is embedded in the context of the compressed image while the Golomb coded bit stream is encrypted. The extraction of the watermark can be done during JPEG-LS decoding. The advantage of this watermarking scheme is that the media need not be decompressed or decrypted to embed the watermark, thus reducing computational complexity while preserving the confidentiality of the media.","PeriodicalId":103791,"journal":{"name":"Proceedings of the International Conference on Signal Processing and Multimedia Applications","volume":"315 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129598443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stereo vision matching over single-channel color-based segmentation","authors":"Pablo Revuelta, B. Ruíz-Mezcua, J. M. S. Peña, J. Thiran","doi":"10.5220/0003473201260130","DOIUrl":"https://doi.org/10.5220/0003473201260130","url":null,"abstract":"Stereo vision is one of the most important passive methods to extract depth maps. Among them, there are several approaches with advantages and disadvantages. Computational load is especially important in both the block matching and graphical cues approaches. In a previous work, we proposed a region growing segmentation solution to the matching process. In that work, matching was carried out over statistical descriptors of the image regions, commonly referred to as characteristic vectors, whose number is, by definition, lower than the possible block matching possibilities. This first version was defined for gray scale images. Although efficient, the gray scale algorithm presented some important disadvantages, mostly related to the segmentation process. In this article, we present a pre-processing tool to compute gray scale images that maintains the relevant color information, preserving both the advantages of gray scale segmentation and those of color image processing. The results of this improved algorithm are shown and compared to those obtained by the gray scale segmentation and matching algorithm, demonstrating a significant improvement of the computed depth maps.","PeriodicalId":103791,"journal":{"name":"Proceedings of the International Conference on Signal Processing and Multimedia Applications","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132438649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AER spike-processing filter simulator: Implementation of an AER simulator based on cellular automata","authors":"Manuel Rivas Pérez, A. Linares-Barranco, A. Jiménez-Fernandez, A. C. Balcells, G. Jiménez-Moreno","doi":"10.5220/0003525900910096","DOIUrl":"https://doi.org/10.5220/0003525900910096","url":null,"abstract":"Spike-based systems are neuro-inspired circuit implementations traditionally used for sensory systems or sensor signal processing. Address-Event-Representation (AER) is a neuromorphic communication protocol for transferring asynchronous events between VLSI spike-based chips. These neuro-inspired implementations allow developing complex, multilayer, multichip neuromorphic systems and have been used to design sensor chips, such as retinas and cochleae, processing chips, e.g. filters, and learning chips. Furthermore, Cellular Automata (CA) is a bio-inspired processing model for problem solving. This approach divides the processing into synchronous cells, which change their states at the same time in order to reach the solution. This paper presents a software simulator able to gather several spike-based elements into the same workspace in order to test a CA architecture based on AER before a hardware implementation. Furthermore, this simulator produces VHDL for testing the AER-CA in the FPGA of the USB-AER AER-tool.","PeriodicalId":103791,"journal":{"name":"Proceedings of the International Conference on Signal Processing and Multimedia Applications","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123012124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal combination of low-level features for surveillance object retrieval","authors":"Virginia Fernandez Arguedas, K. Chandramouli, Qianni Zhang, E. Izquierdo","doi":"10.5220/0003527101870192","DOIUrl":"https://doi.org/10.5220/0003527101870192","url":null,"abstract":"In this paper, a low-level multi-feature fusion based classifier is presented for studying the performance of an object retrieval method from surveillance videos. The proposed retrieval framework exploits recent developments in evolutionary computation algorithms based on biologically inspired optimisation techniques. The multi-descriptor space is formed with a combination of four MPEG-7 visual features. The proposed approach has been evaluated against kernel machines for objects extracted from the AVSS 2007 dataset.","PeriodicalId":103791,"journal":{"name":"Proceedings of the International Conference on Signal Processing and Multimedia Applications","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115280710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Segmentation of touching Lanna characters","authors":"Sakkayaphop Pravesjit, A. Thammano","doi":"10.5220/0003511300470051","DOIUrl":"https://doi.org/10.5220/0003511300470051","url":null,"abstract":"Character segmentation is an important preprocessing step for character recognition. Incorrectly segmented characters are not likely to be correctly recognized. Touching characters are one of the most difficult segmentation cases that arise when handwritten characters are segmented. Therefore, this paper focuses on the segmentation of touching and overlapping characters. In the proposed character segmentation process, bounding box analysis is initially employed to segment the document image into images of isolated characters and images of touching characters. A thinning algorithm is applied to extract the skeleton of the touching characters. Next, the skeleton of the touching characters is separated into several pieces. Finally, the separated pieces of the touching characters are put back together to reconstruct two isolated characters. The proposed algorithm achieves an accuracy of 75.3%.","PeriodicalId":103791,"journal":{"name":"Proceedings of the International Conference on Signal Processing and Multimedia Applications","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134315562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What are good CGS/MGS configurations for H.264 quality scalable coding?","authors":"Shih-Hsuan Yang, Wei-Lune Tang","doi":"10.5220/0003608201040109","DOIUrl":"https://doi.org/10.5220/0003608201040109","url":null,"abstract":"Scalable video coding (SVC) encodes image sequences into a single bit stream that can be adapted to various network and terminal capabilities. The H.264/AVC standard includes three kinds of video scalability: spatial scalability, temporal scalability, and quality scalability. Among them, quality scalability refers to image sequences of the same spatio-temporal resolution but with different fidelity levels. Two options of quality scalability are adopted in H.264/AVC, namely CGS (coarse-grain quality scalable coding) and MGS (medium-grain quality scalability), and they may be used in combination. A refinement layer in CGS is obtained by re-quantizing the (residual) texture signal with a smaller quantization step size (QP). Using CGS alone, however, may incur a notable PSNR penalty and high encoding complexity if numerous rate points are required. MGS partitions the transform coefficients of a CGS layer into several MGS sub-layers and distributes them in different NAL units. The use of MGS may increase the adaptation flexibility, improve the coding efficiency, and reduce the coding complexity. In this paper, we investigate the CGS/MGS configurations that lead to good performance. From extensive experiments using the JSVM (Joint Scalable Video Model), however, we find that MGS should be carefully employed. Although MGS always reduces the encoding complexity as compared to using CGS alone, its rate-distortion performance is unstable. While MGS typically provides better or comparable rate-distortion performance for the cases with eight rate points or more, some configurations may cause an unexpected PSNR drop with an increased bit rate. This anomaly is currently under investigation.","PeriodicalId":103791,"journal":{"name":"Proceedings of the International Conference on Signal Processing and Multimedia Applications","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128549929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time face recognition with GPUs: A DCT-based face recognition system using graphics processing unit","authors":"D. Alexiadis, A. Papastergiou, A. Hatzigaidas","doi":"10.5220/0003445601200125","DOIUrl":"https://doi.org/10.5220/0003445601200125","url":null,"abstract":"In this paper, we present an implementation of a 2-D DCT-based face recognition system, which uses a high performance parallel computing architecture, based on Graphics Processing Units (GPUs). Comparisons between the GPU-based and the “gold” CPU-based implementation in terms of execution time have been made. They show that the GPU implementation (NVIDIA GeForce GTS 250) is about 50 times faster than the CPU-based one (Intel Dual Core 1.83GHz), allowing the real-time operation of the developed face recognition system. Additionally, comparisons of the DCT-based approach with the PCA-based face recognition methodology show that the DCT-based approach can achieve comparable recognition hit rates.","PeriodicalId":103791,"journal":{"name":"Proceedings of the International Conference on Signal Processing and Multimedia Applications","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130553949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Four-phase re-speaker training system","authors":"A. Pražák, Zdenek Loose, J. Psutka, V. Radová, L. Müller","doi":"10.5220/0003604502170220","DOIUrl":"https://doi.org/10.5220/0003604502170220","url":null,"abstract":"Since the re-speaker approach to the automatic captioning of TV broadcasts using large vocabulary continuous speech recognition (LVCSR) is on the increase, there is also a growing demand for training systems that would allow new speakers to learn the procedure. This paper describes a specially designed re-speaker training system that provides a gradual four-phase tutoring process with quantitative indicators of a trainee's progress to enable faster (and thus cheaper) training of the re-speakers. The performance evaluation of three re-speakers who were trained on the proposed system is also reported.","PeriodicalId":103791,"journal":{"name":"Proceedings of the International Conference on Signal Processing and Multimedia Applications","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124846367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}