{"title":"Action recognition by single stream convolutional neural networks: An approach using combined motion and static information","authors":"Sameera Ramasinghe, R. Rodrigo","doi":"10.1109/ACPR.2015.7486474","DOIUrl":"https://doi.org/10.1109/ACPR.2015.7486474","url":null,"abstract":"We investigate the problem of automatic action recognition and classification of videos. In this paper, we present a convolutional neural network architecture, which takes both motion and static information as inputs in a single stream. We show that the network is able to treat motion and static information as different feature maps and extract features off them, although stacked together. We trained and tested our network on Youtube dataset. Our network is able to surpass state-of-the-art hand-engineered feature methods. Furthermore, we also studied and compared the effect of providing static information to the network, in the task of action recognition. Our results justify the use of optic flows as the raw information of motion and also show the importance of static information, in the context of action recognition.","PeriodicalId":240902,"journal":{"name":"2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126013001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"My camera can see through fences: A deep learning approach for image de-fencing","authors":"Sankaraganesh Jonna, K. K. Nakka, R. R. Sahay","doi":"10.1109/ACPR.2015.7486506","DOIUrl":"https://doi.org/10.1109/ACPR.2015.7486506","url":null,"abstract":"In recent times, the availability of inexpensive image capturing devices such as smartphones/tablets has led to an exponential increase in the number of images/videos captured. However, sometimes the amateur photographer is hindered by fences in the scene which have to be removed after the image has been captured. Conventional approaches to image de-fencing suffer from inaccurate and non-robust fence detection apart from being limited to processing images of only static occluded scenes. In this paper, we propose a semi-automated de-fencing algorithm using a video of the dynamic scene. We use convolutional neural networks for detecting fence pixels. We provide qualitative as well as quantitative comparison results with existing lattice detection algorithms on the existing PSU NRT data set [1] and a proposed challenging fenced image dataset. The inverse problem offence removal is solved using split Bregman technique assuming total variation of the de-fenced image as the regularization constraint.","PeriodicalId":240902,"journal":{"name":"2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129030296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Uniform low-rank representation for unsupervised visual domain adaptation","authors":"Pengcheng Liu, Peipei Yang, Kaiqi Huang, T. Tan","doi":"10.1109/ACPR.2015.7486497","DOIUrl":"https://doi.org/10.1109/ACPR.2015.7486497","url":null,"abstract":"Visual domain adaptation aims to adapt a model learned in source domain to target domain, which has received much attention in recent years. In this paper, we propose a uniform low-rank representation based unsupervised domain adaptation method which captures the intrinsic relationship among the source and target samples and meanwhile eliminates the disturbance from the noises and outliers. In particular, we first align the source and target samples into a common subspace using a subspace alignment technique. Then we learn a domain-invariant dictionary with respect to the transformed source and target samples. Finally, all the transformed samples are low-rank represented based on the learned dictionary. Extensive experimental results show that our method is beneficial to reducing the domain difference, and we achieve the state-of-the-art performance on the widely used visual domain adaptation benchmark.","PeriodicalId":240902,"journal":{"name":"2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"229 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133611442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DASA: Domain adaptation in stacked autoencoders using systematic dropout","authors":"Abhijit Guha Roy, D. Sheet","doi":"10.1109/ACPR.2015.7486600","DOIUrl":"https://doi.org/10.1109/ACPR.2015.7486600","url":null,"abstract":"Domain adaptation deals with adapting behaviour of machine learning based systems trained using samples in source domain to their deployment in target domain where the statistics of samples in both domains are dissimilar The task of directly training or adapting a learner in the target domain is challenged by lack of abundant labeled samples. In this paper we propose a technique for domain adaptation in stacked autoencoder (SAE) based deep neural networks (DNN) performed in two stages: (i) unsupervised weight adaptation using systematic dropouts in mini-batch training, (ii) supervised fine-tuning with limited number of labeled samples in target domain. We experimentally evaluate performance in the problem of retinal vessel segmentation where the SAE-DNN is trained using large number of labeled samples in the source domain (DRIVE dataset) and adapted using less number of labeled samples in target domain (STARE dataset). The performance of SAE-DNN measured using logloss in source domain is 0.19, without and with adaptation are 0.40 and 0.18, and 0.39 when trained exclusively with limited samples in target domain. The area under ROC curve is observed respectively as 0.90, 0.86, 0.92 and 0.87. The high efficiency of vessel segmentation with DASA strongly substantiates our claim.","PeriodicalId":240902,"journal":{"name":"2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130619287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Local feature reliability measure using multiview synthetic images for mobile visual search","authors":"Kohei Matsuzaki, Yusuke Uchida, S. Sakazawa, S. Satoh","doi":"10.1109/ACPR.2015.7486485","DOIUrl":"https://doi.org/10.1109/ACPR.2015.7486485","url":null,"abstract":"In this paper, we propose a new database (DB) construction method for the mobile visual search (MVS) system based on the local feature and bag-of-visual-words framework. In MVS, quantization error is unavoidable and causes performance degradation. Typical approaches for visual search extract features from a single view of reference images, though such features are insufficient to manage the quantization error. In this paper, we generate multiview synthetic images and extract local features. These features are resampled according to our novel reliability measure in order to reduce the DB size. Experiments on the three datasets show that the proposed method successfully constructs a robust DB with same size. The proposed method improved the mean average precision compared with a conventional method without changing the searching procedure.","PeriodicalId":240902,"journal":{"name":"2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116989330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reduce false positives for human detection by a priori probability in videos","authors":"Lei Wang, Xu Zhao, Yuncai Liu","doi":"10.1109/ACPR.2015.7486570","DOIUrl":"https://doi.org/10.1109/ACPR.2015.7486570","url":null,"abstract":"In this work, we address the problem of reducing the false positives for human detection in videos. We employ the motion cue to build a foreground probability model. Then the mean expectation of the pixel-level foreground probability is computed to assign a priori probability to the sliding window in detection. We combine the response of Deformable Part Models and the mean probability expectation to form the features and train a linear classifier. The proposed approach is threshold-free, and reduces the false positives in human detection by the foreground cues. As well, we describe an integral probability image for fast computation of the mean probability expectation. Experimental results show that the proposed method achieve superior performance over the baseline of Deformable Part Models.","PeriodicalId":240902,"journal":{"name":"2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132824185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Augmented text character proposals and convolutional neural networks for text spotting from scene images","authors":"Alessandro Zamberletti, I. Gallo, L. Noce","doi":"10.1109/ACPR.2015.7486493","DOIUrl":"https://doi.org/10.1109/ACPR.2015.7486493","url":null,"abstract":"In this work we propose a novel method for text spotting from scene images based on augmented Multi-resolution Maximally Stable Extremal Regions and Convolutional Neural Networks. The goal of this work is augmenting text character proposals to maximize their coverage rate over text elements in scene images, to obtain satisfying text detection rates without the need of using very deep architectures nor large amount of training data. Using simple and fast geometric transformations on multi-resolution proposals our system achieves good results for several challenging datasets while also being computationally efficient to train and test on a desktop computer.","PeriodicalId":240902,"journal":{"name":"2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128180327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust road lane detection using extremal-region enhancement","authors":"Jingchen Gu, Qieshi Zhang, S. Kamata","doi":"10.1109/ACPR.2015.7486557","DOIUrl":"https://doi.org/10.1109/ACPR.2015.7486557","url":null,"abstract":"Road lane detection is a key problem in advanced driver-assistance systems (ADAS). For solving this problem, vision-based detection methods are widely used and are generally focused on edge information. However, only using edge information leads to miss detection and error detection in various road conditions. In this paper, we propose a neighbor-based image conversion method, called extremal-region enhancement. The proposed method enhances the white lines in intensity, hence it is robust to shadows and illuminance changes. Both edge and shape information of white lines are extracted as lane features in the method. In addition, we implement a robust road lane detection algorithm using the extracted features and improve the correctness through probability tracking. The experimental result shows an average detection rate increase of 13.2% over existing works.","PeriodicalId":240902,"journal":{"name":"2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128218325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated prognosis analysis for traumatic brain injury CT images","authors":"Tianxia Gong, Abhinit Kumar Ambastha, C. Tan, Bolan Su, Tchoyoson C. C. Lim","doi":"10.1109/ACPR.2015.7486531","DOIUrl":"https://doi.org/10.1109/ACPR.2015.7486531","url":null,"abstract":"Traumatic brain injury (TBI) is a major cause of deaths worldwide. In this paper, we propose a framework for automatic brain CT image analysis and Glasgow Outcome Scale (GOS) prediction for TBI cases. For each TBI case, we first select a fixed number of images to represent the case, then we extract Gabor features from these images and form a feature vector. As a large number of features are extracted from the images, we use PCA to select the features for training and testing. We then use random forest for training and testing of our prognosis model. The overall accuracy of binary GOS classification is between 73% and 75% for different GOS dichotomizations.","PeriodicalId":240902,"journal":{"name":"2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132200925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hyperspectral image classification using Gradient Local Auto-Correlations","authors":"Cheng Chen, Junjun Jiang, Baochang Zhang, Wankou Yang, Jianzhong Guo","doi":"10.1109/ACPR.2015.7486544","DOIUrl":"https://doi.org/10.1109/ACPR.2015.7486544","url":null,"abstract":"Spatial information has been verified to be helpful in hyperspectral image classification. In this paper, a spatial feature extraction method utilizing spatial and orientational auto-correlations of image local gradients is presented for hyperspectral imagery (HSI) classification. The Gradient Local Auto-Correlations (GLAC) method employs second order statistics (i.e., auto-correlations) to capture richer information from images than the histogram-based methods (e.g., Histogram of Oriented Gradients) which use first order statistics (i.e., histograms). The experiments carried out on two hyperspectral images proved the effectiveness of the proposed method compared to the state-of-the-art spatial feature extraction methods for HSI classification.","PeriodicalId":240902,"journal":{"name":"2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134229486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}