Jae-il Kim, Munchurl Kim, Sangjin Hahm, In-joon Cho, Changsub Park
{"title":"Block-Mode Classification Using SVMs for Early Termination of Block Mode Decision in H.264|MPEG-4 Part 10 AVC","authors":"Jae-il Kim, Munchurl Kim, Sangjin Hahm, In-joon Cho, Changsub Park","doi":"10.1109/ICAPR.2009.65","DOIUrl":"https://doi.org/10.1109/ICAPR.2009.65","url":null,"abstract":"In this paper, a two-stage block-mode classification scheme for H.264|MPEG-4 Part 10 AVC is presented as a pattern classification approach using SVMs in order to reduce the high computational complexity of its encoders. For the block-mode classification, the feature vectors for each macroblock are formed for the SVMs from SATD and CBP values to detect the large and small block modes. In the experimental results, the proposed scheme yields average correct classification rates of 80% and 95% for the first and second stages, respectively, which leads to a 35% to 55% reduction in the total encoding time while maintaining negligible bit rate increases and PSNR drops for test sequences with QCIF, CIF, and 4CIF resolutions and various quantization parameter values.","PeriodicalId":443926,"journal":{"name":"2009 Seventh International Conference on Advances in Pattern Recognition","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125716939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neurological Foundation of Image Processing","authors":"A. Przybyszewski","doi":"10.1109/ICAPR.2009.100","DOIUrl":"https://doi.org/10.1109/ICAPR.2009.100","url":null,"abstract":"A popular computational approach is to process visual images by dividing them into crisp (winner-takes-all) parts, in analogy to the properties of neurophysiological receptive fields. The problem with such a symbolic representation is that in a real environment object attributes are seldom invariant. We propose to divide images into rough parts using hierarchical, multi-valued processes. The bottom-up computation (BUC) is related to prediction, where object attributes are approximated by different granules with properties similar to different brain areas: by dots as in the thalamus, by oriented lines as in the primary visual cortex, and by elementary shapes as in V4. There are a large number of possible combinations of elementary granules; therefore objects in BUC are overrepresented. The top-down computation (TDC) fits the prediction to a hypothesis posed by more complex properties (higher brain areas). If the hypothesis check is positive, TDC verifies the object and eliminates other possible patterns. Such classifications take place in parallel at many functional units. We show an example of such hierarchical system computation on experimentally recorded data from monkey visual area (V4).","PeriodicalId":443926,"journal":{"name":"2009 Seventh International Conference on Advances in Pattern Recognition","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122510966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection of Neural Activities in FMRI Using Jensen-Shannon Divergence","authors":"J. Basak","doi":"10.1109/ICAPR.2009.61","DOIUrl":"https://doi.org/10.1109/ICAPR.2009.61","url":null,"abstract":"In this paper, we present a statistical technique based on Jensen-Shannon divergence for detecting the regions of activity in fMRI images. The method is model free, and we exploit the metric property of the square root of Jensen-Shannon divergence to accumulate the variations between successive time frames of fMRI images. Experimentally we show the effectiveness of our algorithm.","PeriodicalId":443926,"journal":{"name":"2009 Seventh International Conference on Advances in Pattern Recognition","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126397534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised Satellite Image Segmentation by Combining SA Based Fuzzy Clustering with Support Vector Machine","authors":"A. Mukhopadhyay, U. Maulik","doi":"10.1109/ICAPR.2009.50","DOIUrl":"https://doi.org/10.1109/ICAPR.2009.50","url":null,"abstract":"Fuzzy clustering is an important tool for unsupervised pixel classification in remotely sensed satellite images. In this article, a Simulated Annealing (SA) based fuzzy clustering method is developed and combined with the popular Support Vector Machine (SVM) classifier to fine-tune the clustering produced by SA and obtain improved clustering performance. The performance of the proposed technique has been compared with that of some other well-known algorithms for an IRS satellite image of the city of Kolkata, and its superiority has been demonstrated quantitatively and visually.","PeriodicalId":443926,"journal":{"name":"2009 Seventh International Conference on Advances in Pattern Recognition","volume":"55 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134105291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tracking Multiple Circular Objects in Video Using Helmholtz Principle","authors":"Snehasis Mukherjee, D. Mukherjee","doi":"10.1109/ICAPR.2009.22","DOIUrl":"https://doi.org/10.1109/ICAPR.2009.22","url":null,"abstract":"A novel algorithm is introduced to track multiple circular objects present in a video using the Helmholtz perception principle. First, segmentation of circular objects in the video frame is performed using the perception principle, and then the same perception principle is applied to track the circular objects. For each circular object present in the video, we assess the meaningfulness of the shift of its center of gravity and the meaningfulness of the deviation of the direction of movement of the object due to inter-frame displacement. We show that a logical threshold on the meaningfulness value tracks circular objects in a video effectively and efficiently.","PeriodicalId":443926,"journal":{"name":"2009 Seventh International Conference on Advances in Pattern Recognition","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130705098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Infant Identification from Their Cry","authors":"H. Patil","doi":"10.1109/ICAPR.2009.73","DOIUrl":"https://doi.org/10.1109/ICAPR.2009.73","url":null,"abstract":"Cry is the only means of communication for an infant. Understanding the properties of infant cry is crucial for establishing a basis for using cry as a tool for pathological diagnosis or possibly identifying infants. In this paper, an attempt is made to identify infants from their cry. The experiments are shown for Linear Prediction Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC), Mel Frequency Cepstral Coefficients (MFCC), and Teager energy based MFCC (T-MFCC) as input feature vectors to the polynomial classifier of 2nd and 3rd order approximation. Results show that MFCC performs better than the other features. This may be due to the fact that MFCC is designed to mimic the human perception process and hence represents the perceptually relevant aspects of the short-time infant cry spectrum.","PeriodicalId":443926,"journal":{"name":"2009 Seventh International Conference on Advances in Pattern Recognition","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133044613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Document Clustering for Event Identification and Trend Analysis in Market News","authors":"Lipika Dey, Anuj Mahajan, S. M. Haque","doi":"10.1109/ICAPR.2009.84","DOIUrl":"https://doi.org/10.1109/ICAPR.2009.84","url":null,"abstract":"In this paper we propose a stock market analysis system that analyzes financial news items to identify and characterize major events that impact the market. The events are identified using a Latent Dirichlet Allocation (LDA) based topic extraction mechanism. The topic-document data is then clustered using the kernel k-means algorithm. The clusters are analyzed jointly with the SENSEX raw data to extract major events and their effects. The system has been implemented on capital market news about the Indian share market of the past three years.","PeriodicalId":443926,"journal":{"name":"2009 Seventh International Conference on Advances in Pattern Recognition","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115514363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Variational Gaussian Mixture Models for Speech Emotion Recognition","authors":"H. K. Mishra, C. Sekhar","doi":"10.1109/ICAPR.2009.89","DOIUrl":"https://doi.org/10.1109/ICAPR.2009.89","url":null,"abstract":"In this paper, the applicability of variational methods for estimating the parameters of models used for speech emotion recognition is discussed. When the amount of data available is not adequate for training complex models, the variational Bayesian method helps in training models with less data. It also helps in determining the optimal complexity of the model. Our studies on the Berlin emotional speech database show that variational methods perform better than the maximum likelihood approach in estimating the parameters of Gaussian mixture models used in speech emotion recognition.","PeriodicalId":443926,"journal":{"name":"2009 Seventh International Conference on Advances in Pattern Recognition","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124459143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Texture Based Palmprint Identification Using DCT Features","authors":"M. Dale, M. Joshi, N. Gilda","doi":"10.1109/ICAPR.2009.76","DOIUrl":"https://doi.org/10.1109/ICAPR.2009.76","url":null,"abstract":"In palmprint recognition applications, utilizing information beyond principal lines or minutiae is very helpful. In this paper we propose a Discrete Cosine Transform (DCT) based feature vector for palmprint representation and matching and compare it with the DFT and the wavelet transform. Here the central part of the palmprint image of size 128x128 is resized to 64x64 and divided into four non-overlapping sub-images. The transform is applied on each sub-image directly without any preprocessing. By dividing the transformed sub-image into nine blocks, the standard deviation is calculated for each block, and in total 36 (9x4=36) standard deviations form the feature vector. This feature vector is used in the matching stage. In total, 10 images per person are taken from a standard available database. The training set is prepared with k images, where k varies from 1 to 8. Results are checked against the remaining images in identification mode. Results are reported in terms of genuine acceptance rate (%). In identification mode, a 97.5% recognition rate is obtained. The work is preliminary, but the recognition rate is promising and is obtained without any pre-processing.","PeriodicalId":443926,"journal":{"name":"2009 Seventh International Conference on Advances in Pattern Recognition","volume":"255 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124849106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Still Image and Video Fingerprinting","authors":"N. Nikolaidis, I. Pitas","doi":"10.1109/ICAPR.2009.83","DOIUrl":"https://doi.org/10.1109/ICAPR.2009.83","url":null,"abstract":"Multimedia fingerprinting, also known as robust/perceptual hashing and replica detection, is an emerging technology that can be used as an alternative to watermarking for the efficient Digital Rights Management (DRM) of multimedia data. Two fingerprinting approaches are reviewed in this paper. The first is an image fingerprinting technique that makes use of color and texture descriptors, R-trees and Linear Discriminant Analysis (LDA). The second is a two-step, coarse-to-fine video fingerprinting method that involves color-based descriptors, R-trees and a frame-based voting procedure. Experimental performance evaluation is provided for both methods.","PeriodicalId":443926,"journal":{"name":"2009 Seventh International Conference on Advances in Pattern Recognition","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131718859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}