{"title":"KL Divergence Based Person Re-identification Using Multivariate Gaussian Distributions","authors":"Hongyuan Wang, Zongyuan Ding, Tongguang Ni, Fuhua Chen","doi":"10.1109/ACPR.2017.17","DOIUrl":"https://doi.org/10.1109/ACPR.2017.17","url":null,"abstract":"This paper focuses on the distribution of each class and proposes a novel person re-identification method that uses K-L divergence in the metric learning stage. The metric learning is based not on images or features directly, but on distributions. The key idea is to assume that each person is a distribution and each image of a person is an instance of that distribution. Recognizing a probe then becomes the task of determining which distribution the probe belongs to. Furthermore, the method assumes that the features of a person follow a multivariate Gaussian distribution and that different people's distributions differ only in their feature means while sharing the same covariance matrix. The learning process finds a globally optimal covariance matrix among features for all of the distributions. A probe is then classified by comparing its K-L divergence with each class (distribution). The major contribution of this paper lies in the idea of distribution-based metric learning, which differs significantly from most existing methods. Since the learning is among distributions, not images, the proposed model significantly reduces the computational cost and complexity and is much faster than traditional methods, while the recognition rate remains quite competitive.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134645767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extending the Full Procrustes Distance to Anisotropic Scale in Shape Analysis","authors":"Tsukasa Okamoto, Kazunori Iwata, N. Suematsu","doi":"10.1109/ACPR.2017.139","DOIUrl":"https://doi.org/10.1109/ACPR.2017.139","url":null,"abstract":"The full Procrustes distance between the configuration matrices of landmarks is the most fundamental landmark-based distance in shape analysis. To summarize, it is obtained by matching landmarks on a shape with those on another shape as closely as possible over the similarity transformations that consist of translation, rotation, and isotropic scaling. Thus, it considers similarity transformations only. Accordingly, it often does not work well for shapes skewed by non-similarity transformations. In this paper, we provide an efficient solution to this problem by extending the full Procrustes distance to anisotropic scale. With several shape datasets, we demonstrate that the extended full Procrustes distance is more effective in shape retrieval than typical distances, including the original full Procrustes distance.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130135499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applying BinaryWeights in Neural Networks for Sequential Text Recognition","authors":"Zihe Wang, Chun Yang, Xu-Cheng Yin","doi":"10.1109/ACPR.2017.118","DOIUrl":"https://doi.org/10.1109/ACPR.2017.118","url":null,"abstract":"With the development of deep learning, researchers have achieved many breakthroughs on classical problems. Unfortunately, this progress places heavy demands on hardware, especially GPUs, and causes huge energy consumption. Therefore, how to implement these neural networks with lower hardware requirements is attracting more and more attention. In this paper, we make two contributions. First, we build a deep learning framework that supports training and prediction with binarized neural networks. This framework includes binarized layers, e.g. Convolution (Conv) and LSTM layers. It is based on our analysis of how to implement a binarized layer and train BinaryWeights with it. Second, we construct a network from the binarized layers implemented in our framework and achieve good performance on sequential text recognition. We also modify the network architecture to obtain these results with little loss in accuracy.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115358170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Kernel Block-Sparse Representation for Classification","authors":"Krishan Sharma, Renu M. Rameshan","doi":"10.1109/ACPR.2017.140","DOIUrl":"https://doi.org/10.1109/ACPR.2017.140","url":null,"abstract":"Block-sparse representation based classifier (BSRC), an extension of sparse representation based classification (SRC), shows good performance on face recognition tasks with a small amount of training data. However, BSRC does not handle the inter-block and intra-block non-linearities present in the input feature space. As most real-world data is non-linear, this paper presents a non-linear extension of BSRC, the kernel block-sparse representation based classifier (KBSRC), using the kernel trick. We transform the data non-linearly to a higher dimensional space. Two convex optimization problems based on block sparsity are solved in this kernel feature space. A dimensionality reduction technique in the kernel feature space is also applied to avoid increasing the computational complexity of the algorithm. In the reduced kernel feature subspace, a test sample is assigned a label based on block-sparse coefficients. To validate the proposed approach, experiments on the Extended Yale B Face, ISOLET and MNIST datasets are performed and show significant improvement in classification performance with less training data than state-of-the-art methods.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123530683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From Classification to Regression: Model Transfer for Visual Aesthetic Quality Assessment","authors":"Wenzhen Huang, Peipei Yang, Kaiqi Huang","doi":"10.1109/ACPR.2017.127","DOIUrl":"https://doi.org/10.1109/ACPR.2017.127","url":null,"abstract":"Visual aesthetic quality assessment plays an important role in an increasing number of computer vision applications. In particular, estimating the quality score precisely is a main task of aesthetic quality assessment, but training samples labeled with scores are usually expensive to obtain. In this paper, we propose a transfer learning method that improves the performance of aesthetic score prediction by using coarsely labeled samples, which are much easier to obtain. The proposed method incorporates the coarse information from the source domain into the target domain through a novel multi-task framework, which can revise the model in the target task. Experimental results demonstrate the effectiveness of our method: the error is clearly reduced with the help of the source domain.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125010161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large Vocabulary Hybrid DNN/HMM Arabic Online Handwriting Recognition System","authors":"Omar Khaled Ali Ragab, A. Fahmy, Sherif M. Abdou","doi":"10.1109/ACPR.2017.114","DOIUrl":"https://doi.org/10.1109/ACPR.2017.114","url":null,"abstract":"Online Arabic handwriting recognition is a difficult problem since it is naturally both cursive and unconstrained. The analysis of Arabic script is further complicated due to obligatory dots/strokes that are placed above or below most letters and usually written delayed in order. In addition, Arabic language is rich in morphology and syntax which makes it a must for a good online handwriting system to handle large vocabulary lexicon. Previously, Hidden Markov Model (HMM) with sequence reordering have provided a successful solution for most of the difficulties inherent in recognizing Arabic handwriting. Recently, Deep Neural Networks (DNN) have shown to provide significant improvement when integrated with HMM. In this paper we introduce the efforts done to build a large vocabulary Arabic HWR system using hybrid DNN/HMM model. This system used over segmentation to provide efficient decoding. The developed system was tested using a test set of 12k words written by 100 writers with lexicon size of 125k words. The system achieved an accuracy of 71.62%, 89.61% in first recognized word and top five recognized words respectively which to our knowledge is the best reported result for large vocabulary Arabic HWR.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124302765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correlated Motion Based Crowd Analysis in Queueing Situations","authors":"Csaba Beleznai, A. Zweng, Daniel Steininger","doi":"10.1109/ACPR.2017.117","DOIUrl":"https://doi.org/10.1109/ACPR.2017.117","url":null,"abstract":"Crowd analysis by automated visual surveillance represents a challenging task in many practically relevant scenarios. In this paper we address the problem of capturing relevant correlated movement within a line formed by waiting pedestrians to estimate the time needed for the last person to reach the queue front. To obtain a waiting time estimate we propose to solve two interlinked problems: queue shape delineation and motion characterization estimating the propagation velocity along the segmented queue. Accordingly, we present a scheme to reliably segment the queue shape by finding and refining an optimum path over time. The optimality condition refers to minimizing its length while maximizing its overlap with observed correlated motion patterns. To capture the collective motion of the crowd within the queue we employ a deformable chain structure to temporally aggregate the relevant short-term forward movement by tracking. The resulting tracked chain structure is used to generate a mean forward propagation velocity estimate. The presented approach represents a general analysis scheme, requiring only a set of tracked pedestrians on a calibrated ground plane at every frame. We validate our proposed scheme on two real datasets with time-varying queue structures. 
Based on a comparison to manually-set ground truth, obtained results show that queue delineation and waiting time estimates are reliable, can cope with motion clutter and well characterize the waiting behavior and its temporal evolution.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122457342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Font Creation Using Class Discriminative Deep Convolutional Generative Adversarial Networks","authors":"Kotaro Abe, Brian Kenji Iwana, Viktor Gosta Holmer, S. Uchida","doi":"10.1109/ACPR.2017.99","DOIUrl":"https://doi.org/10.1109/ACPR.2017.99","url":null,"abstract":"In this research, we attempt to generate fonts automatically using a modification of a Deep Convolutional Generative Adversarial Network (DCGAN) that introduces class consideration. DCGANs are an application of generative adversarial networks (GANs) that makes use of convolutional and deconvolutional layers to generate data through adversarial detection. The conventional GAN comprises two neural networks that work in series. Specifically, it approaches an unsupervised method of data generation with the use of a generative network whose output is fed into a second discriminative network. While DCGANs have been successful on natural images, we show their limited ability in font generation due to the high variation of fonts combined with the need for rigid character structures. We propose a class discriminative DCGAN which uses a classification network alongside the discriminative network to refine the generative network. The results of our experiment show a dramatic improvement over the conventional DCGAN.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128680868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Impact of Ageing on EEG Based Biometric Systems","authors":"Barjinder Kaur, Pradeep Kumar, P. Roy, D. Singh","doi":"10.1109/ACPR.2017.33","DOIUrl":"https://doi.org/10.1109/ACPR.2017.33","url":null,"abstract":"With the development of sensor technology, Electroencephalography (EEG) has been a popular area of interest in recent years. Also, a great degree of change with age has been found in face, voice, fingerprint and other physiological biometric identifier systems. The distinct characteristics of neuro-signals have focused the attention of the research community on building user identification systems that are resistant to attacks. However, the permanence of brain signals has been studied only sporadically. In this paper, we investigate the robustness of EEG signals to address the longitudinal stability issue and its effectiveness in user identification systems. The Discrete Wavelet Transform (DWT) signal decomposition technique has been applied to extract Alpha-band waves. Further, two statistical features, namely Root Mean Square (RMS) and Integrated EEG (IEEG), have been calculated for the band waves. Person identification has been performed using three well-known classification techniques, namely Random Forest (RF), Support Vector Machine (SVM) and k-Nearest Neighbor (k-NN). EEG data from 10 users has been recorded in 6 different sessions within a period of 6 months. Finally, a decision fusion scheme, majority voting, has been applied to boost the system performance. An average accuracy of 80% has been recorded using decision fusion. The results highlight a significant amount of variation across sessions, which shows that various factors could affect the state of the mind with temporal","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127770115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Adaptive Convolutional Neural Network Framework for Multi-user Myoelectric Interfaces","authors":"Keun-Tae Kim, Ki-Hee Park, Seong-Whan Lee","doi":"10.1109/ACPR.2017.52","DOIUrl":"https://doi.org/10.1109/ACPR.2017.52","url":null,"abstract":"Recently, electromyogram (EMG)-based user interfaces have been developed for the control of wearable rehabilitation robots such as arm prosthetics. In these interfaces, decoding the user's movement intention is important for controlling the robots properly. However, high inter-user variation in EMG signals has hindered stable decoding performance across multiple users. In this context, we developed a user-independent decoding method using convolutional neural networks (CNN) for multi-user myoelectric interfaces. Specifically, we devise a user-adaptive framework based on the CNN for decoding movement intentions from raw EMG signals. The Ninapro database was used in our experiments, and the experimental results show that our method successfully decoded hand movement intentions. The effectiveness of the proposed method was also confirmed by experiments decoding movement intentions across different subjects.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128364994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}