{"title":"Learning Light Field Reconstruction from a Single Coded Image","authors":"Anil Kumar Vadathya, Saikiran Cholleti, Gautham Ramajayam, Vijayalakshmi Kanchana, K. Mitra","doi":"10.1109/ACPR.2017.142","DOIUrl":"https://doi.org/10.1109/ACPR.2017.142","url":null,"abstract":"Light field imaging is a rich way of representing the 3D world around us. However, due to limited sensor resolution capturing light field data inherently poses spatio-angular resolution trade-off. In this paper, we propose a deep learning based solution to tackle the resolution trade-off. Specifically, we reconstruct full sensor resolution light field from a single coded image. We propose to do this in three stages 1) reconstruction of center view from the coded image 2) estimating disparity map from the coded image and center view 3) warping center view using the disparity to generate light field. We propose three neural networks for these stages. Our disparity estimation network is trained in an unsupervised manner alleviating the need for ground truth disparity. Our results demonstrate better recovery of parallax from the coded image. Also, we get better results than dictionary learning approaches on simulated data.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"3 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114039188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A News-Driven Recurrent Neural Network for Market Volatility Prediction","authors":"Peikang Lin, Xianjie Mo, Guidong Lin, Liwen Ling, Tingting Wei, W. Luo","doi":"10.1109/ACPR.2017.35","DOIUrl":"https://doi.org/10.1109/ACPR.2017.35","url":null,"abstract":"Extracting hidden information embedded in the financial news is an effective approach to market volatility prediction. In this paper, we propose a recurrent neural network (RNN) based method that dynamically extracts latent structures from the sequence of news events for market prediction. Specifically, we first train a skip-thought model on the financial news datasets to represent the semantic meaning of sentences. Then we aggregate the representations from a single day to form a daily feature to align with the market index. Finally, to make use of the news released some days ago for a better prediction, we exploit the long short-term memory RNN (LSTM-RNN) to integrate the information by exploring the dynamic patterns embedded in the sequence of news events. Extensive experiments on the public available Reuters and Bloomberg financial news datasets verified the effectiveness of our method and demonstrated that our method achieves the state-of-the-art performance.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"237 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125731020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Handwritten Chinese Character Blind Inpainting with Conditional Generative Adversarial Nets","authors":"Zhaobai Zhong, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu","doi":"10.1109/ACPR.2017.60","DOIUrl":"https://doi.org/10.1109/ACPR.2017.60","url":null,"abstract":"It is very common to use a regular grid like Tian-zi-ge or Mi-zi-ge to help writing in Chinese handwriting environment, especially in education and postal area. Although regular grid is helpful for writing, it is a disaster for recognition. This paper focuses on handwritten Chinese character blind inpainting with regular grid and spot. To solve this problem, we use the recently proposed conditional generative adversarial nets (GANs). Different from the traditional engineering based method like line detection or edge detection, conditional GANs learn a map between target and training data. The generator reconstructs character directly from the data and the discriminator guides the training process to make the generated character more realistic. In this paper, we can automatically remove regular grid in handwritten Chinese character and reconstruct the character's strokes correctly. Moreover, the evaluation on classification task achieved a near state-of-the-art performance on the simulation database and got a convincing result on real world regular grid handwritten Chinese character database.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121401382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gait Analysis Using Shadow Motion","authors":"Pradeep Kumar, Rajkumar Saini, Chaitanya Sai Tumma, P. Roy, D. P. Dogra","doi":"10.1109/ACPR.2017.32","DOIUrl":"https://doi.org/10.1109/ACPR.2017.32","url":null,"abstract":"Gait is considered as one of the biometric traits that does not require physical interaction with machines and can be performed at a distance from the computing device. However, majority of the gait recognition systems require the subjects to be monitored in constrained environment within the viewing field of the capturing device. Such systems may fail to recognize a few of the features when the interaction environment is changed or when the body occlusion occurs due to position variations, clothing or belongings. Moreover, the walking style of a user may vary when engaged in different activities such as listening to music, playing games, fast walking, etc. In this paper, we propose a new approach of human gait recognition using Shadow motion sensor, a full body sensor unit. The framework is able to identify users robustly despite changes in their appearances. The device uses a combination of accelerometer, gyroscope and magnetometer sensors for collecting gait features. The identification process is performed using a Random Forest based classification scheme by varying number of trees. A set of users comprising with 23 males and females have participated in the data collection and they have performed four different types of walks including, normal-walk, fastwalk, walking while listening to music and walking while watching video on mobile. An average accuracy of 87.68% has been recorded in all walk scenarios. Results reveal that the proposed study can be used as a stepping stone to design robust gait biometric systems with the help of contact less sensors.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129176413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Age-Invariant Person Identification by Segmentation Verification of Face Image","authors":"Yuta Somada, W. Ohyama, T. Wakabayashi","doi":"10.1109/ACPR.2017.157","DOIUrl":"https://doi.org/10.1109/ACPR.2017.157","url":null,"abstract":"Face recognition has been a major research theme over the last two decades. There are several problems to be solved to improve the performance of face recognition. Such major problems involve appearance variation due to pose, illumination, expression, and aging. In particular, aging includes internal and external factors that cause facial appearance variation and, consequently, it is the most difficult problem to handle. In this paper, we propose a face recognition method that is robust against facial appearance variation due to aging. The proposed method employs segmentation verification of frontal face images that consists of the following three steps. (1) Face image segmentation generates three regional subimages from the input face image. (2) A matching score is calculated using gradient features from a pair consisting of the input image and a registered image for each of the three generated subimages and original (whole face) image. We obtain four matching scores. (3) The verifying classifier evaluates the matching score vector formed of the matching scores calculated for each of the four images and predicts the a posteriori probability that two matching images belong to the same person. The results of an experimental evaluation with the FGNET and MORPH face aging datasets clarify the effectiveness of the proposed method for age invariant face recognition","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128534678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Deep Learning Framework for Segmentation of Retinal Layers from OCT Images","authors":"Karthik Gopinath, Samrudhdhi B. Rangrej, J. Sivaswamy","doi":"10.1109/ACPR.2017.121","DOIUrl":"https://doi.org/10.1109/ACPR.2017.121","url":null,"abstract":"Segmentation of retinal layers from Optical Coherence Tomography (OCT) volumes is a fundamental problem for any computer aided diagnostic algorithm development. This requires preprocessing steps such as denoising, region of interest extraction, flattening and edge detection all of which involve separate parameter tuning. In this paper, we explore deep learning techniques to automate all these steps and handle the presence/absence of pathologies. A model is proposed consisting of a combination of Convolutional Neural Network (CNN) and Long Short Term Memory (LSTM). The CNN is used to extract layers of interest image and extract the edges, while the LSTM is used to trace the layer boundary. This model is trained on a mixture of normal and AMD cases using minimal data. Validation results on three public datasets show that the pixel-wise mean absolute error obtained with our system is 1.30±0.48 which is lower than the inter-marker error of 1.79±0.76. Our model's performance is also on par with the existing methods.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116093327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigating the Stacked Phonetic Bottleneck Feature for Speaker Verification with Short Voice Commands","authors":"Yichi Huang, Yuexian Zou, Yi Liu","doi":"10.1109/ACPR.2017.74","DOIUrl":"https://doi.org/10.1109/ACPR.2017.74","url":null,"abstract":"Text-dependent speaker verification (SV) with short voice command (SV-SVC) has increasing demand in many applications. Different from conventional SV, SV-SVC usually uses short fixed voice commands for user-friendly purpose, which causes technical challenges compared with conventional text-dependent SV using fixed phrases (SV-FP). Research results show that the mainstream SV techniques are not able to provide good performance for SV-SVC tasks since they suffer from strongly lexical-overlapping and short utterance length problems. In this paper, we propose to fully explore the acoustic features and contextual information of the phonetic units to obtain better speaker-utterance related information representation for i-vector based SV-SVC systems. Specifically, instead of using MFCC only, the frame-based phonetic bottleneck (PBN) feature extracted from a phonetic bottleneck neural network (PBNN), the stacked phonetic bottleneck (SBN) feature, the cascaded feature of PBN and MFCC, the cascaded feature of SBN and MFCC (SBNF+MFCC) are extracted for developing i-vector based SV-SVC systems. Intensive experiments on the benchmark database RSR2015 have been conducted to evaluate the performance of our proposed ivector SV-SVC systems. It is encouraged that the contextual information learnt from stacked PBNN does help and proposed ivector SV-SVC system with (SBNF+MFCC) outperforms under experimental conditions.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115681531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reweighted Low Rank Representation Based on Fractional-Order Function","authors":"Yiqiang Zhai, Zexuan Ji","doi":"10.1109/ACPR.2017.132","DOIUrl":"https://doi.org/10.1109/ACPR.2017.132","url":null,"abstract":"Low Rank Representation (LRR) achieves state-of-the-art clustering performance via solving a nuclear norm minimization problem which is a convex relaxation of rank minimization. In this paper, we propose a unified fractional-order function based weighted nuclear norm minimization framework (FWNNM), which can approximate rank minimization better than nuclear norm minimization. Based on the unified framework, a fractional-order function is introduced to reweight the low rank representation (FRLRR) to further improve the lower rank representation of data. By imposing constraints on the eigenvalues of coefficient matrix, the proposed weights are embedded into the formulation to obtain the lower rank representation in each iteration. Experimental results demonstrate the advantage of FRLRR over state-of-the-art methods.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114476170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sequence-to-Sequence Learning for Human Pose Correction in Videos","authors":"S. Swetha, V. Balasubramanian, C. V. Jawahar","doi":"10.1109/ACPR.2017.126","DOIUrl":"https://doi.org/10.1109/ACPR.2017.126","url":null,"abstract":"The power of ConvNets has been demonstrated in a wide variety of vision tasks including pose estimation. But they often produce absurdly erroneous predictions in videos due to unusual poses, challenging illumination, blur, self-occlusions etc. These erroneous predictions can be refined by leveraging previous and future predictions as the temporal smoothness constrain in the videos. In this paper, we present a generic approach for pose correction in videos using sequence learning that makes minimal assumptions on the sequence structure. The proposed model is generic, fast and surpasses the state-of-the-art on benchmark datasets. We use a generic pose estimator for initial pose estimates, which are further refined using our method. The proposed architecture uses Long Short-Term Memory (LSTM) encoder-decoder model to encode the temporal context and refine the estimations. We show 3.7% gain over the baseline Yang & Ramanan (YR) and 2.07% gain over Spatial Fusion Network (SFN) on a new challenging YouTube Pose Subset dataset.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116302844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced Adaptive Locality Preserving Projections for Face Recognition","authors":"Jun Fan, Qiaolin Ye, Ning Ye","doi":"10.1109/ACPR.2017.123","DOIUrl":"https://doi.org/10.1109/ACPR.2017.123","url":null,"abstract":"In this paper, we address the graph-based manifold learning method for face recognition. The proposed method is called enhanced adaptive Locality Preserving Projections. The EALPP integrates four properties: (i) introduction of data label information and parameterless computation of affinity matrix, (ii) QR-decomposition for acceleration of the eigenvector computation, (iii) matrix exponential for solving the problem of singular matrix and (iv) processing of uncorrelated vector of projection matrix. EALPP has been integrated two techniques: Maximum Margin Criterion (MMC) and Locality Preserving Projections (LPP). Face recognition test on four public face databases (ORL, Yale, AR and UMIST) and experimental results demonstrate the effectiveness of EALPP.","PeriodicalId":426561,"journal":{"name":"2017 4th IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114665459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}