{"title":"Image hallucination at different times of day using locally affine model and kNN template matching from time-lapse images","authors":"N. Patel, Tushar Kataria","doi":"10.1145/3009977.3010038","DOIUrl":"https://doi.org/10.1145/3009977.3010038","url":null,"abstract":"Image hallucination has many applications in areas such as image processing, computational photography and image fusion. In this paper, we present an image hallucination technique based on template (patch) matching from a database of time-lapse images and a learned locally affine model. Template-based techniques suffer from blocky artifacts, so we propose two approaches for imposing consistency criteria across neighbouring patches in the form of regularization. We validate our color transfer technique by hallucinating a variety of natural images at different times of the day. We compare the proposed approach with other state-of-the-art techniques for example-image-based color transfer and show that the images obtained using our approach look more plausible and natural.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"4 1","pages":"30:1-30:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74431110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fast identity-independent expression recognition system for robust cartoonification using smart devices","authors":"Gorisha Agarwal, Ronak Garg, Divya Garg, B. Prasad, Tanima Dutta, Hari Prabhat Gupta","doi":"10.1145/3009977.3010055","DOIUrl":"https://doi.org/10.1145/3009977.3010055","url":null,"abstract":"Facial expressions convey rich information about emotions, intentions and other internal states of a person. Automatic facial expression and cartoonification systems are aiming towards the application of computer vision systems in human computer interaction, emotion analysis, medical care, virtual learning and even entertainment. In this paper, we propose an identity-independent robust system to detect human expression and generate their corresponding cartoonified images in real-time using smart-devices. Identity-independent expression recognition system enhances the facial features of query face image using its intra-class variation image and classifies using support vector machines. The method is robust to variation in identity and illumination of the query face image. Along with the basic expressions, like angry, happy and sad, we have also successfully detected the emotional states of sleepy and pain. The experimental results on JAFFE, CK+, PICS, Yalefaces, and Senthil databases show the effectiveness of the system.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"117 1","pages":"15:1-15:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88065387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep automatic license plate recognition system","authors":"Vishal Jain, S. Zitha, A. Rajagopal, S. Biswas, H. S. Bharadwaj, K. Ramakrishnan","doi":"10.1145/3009977.3010052","DOIUrl":"https://doi.org/10.1145/3009977.3010052","url":null,"abstract":"Automatic License Plate Recognition (ALPR) has important applications in traffic surveillance. It is a challenging problem, especially in countries like India where the license plates have varying sizes, number of lines, fonts etc. The difficulty is all the more accentuated in traffic videos as the cameras are placed high and most plates appear skewed. This work aims to address ALPR using Deep CNN methods for real-time traffic videos. We first extract license plate candidates from each frame using edge information and geometrical properties, ensuring high recall. These proposals are fed to a CNN classifier for license plate detection, obtaining high precision. We then use a CNN classifier trained for individual characters along with a spatial transformer network (STN) for character recognition. Our system is evaluated on several traffic videos with vehicles having different license plate formats in terms of tilt, distances, colors, illumination, character size, thickness etc. Results demonstrate robustness to such variations and impressive performance in both localization and recognition. We also make the dataset available for further research on this topic.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"23 1","pages":"6:1-6:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87300186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3D binary signatures","authors":"Siddharth Srivastava, Brejesh Lall","doi":"10.1145/3009977.3010009","DOIUrl":"https://doi.org/10.1145/3009977.3010009","url":null,"abstract":"In this paper, we propose a novel binary descriptor for 3D point clouds. The proposed descriptor, termed 3D Binary Signature (3DBS), is motivated by the matching efficiency of binary descriptors for 2D images. 3DBS describes keypoints from point clouds with a binary vector, resulting in extremely fast matching. The method uses keypoints from standard keypoint detectors. The descriptor is built by constructing a Local Reference Frame and aligning a local surface patch accordingly. The local surface patch is constituted by identifying nearest neighbours based upon an angular constraint among them. The points are ordered with respect to the distance from the keypoints. The normals of the ordered pairs of these keypoints are projected on the axes and the relative magnitude is used to assign a binary digit. The vector thus constituted is used as a signature for representing the keypoints. The matching is done using the Hamming distance. We show that 3DBS outperforms state-of-the-art descriptors on various evaluation metrics.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"87 3 1","pages":"77:1-77:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82134389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bishnupur heritage image dataset (BHID): a resource for various computer vision applications","authors":"Mrinmoy Ghorai, Pulak Purkait, Sanchayan Santra, S. Samanta, B. Chanda","doi":"10.1145/3009977.3010005","DOIUrl":"https://doi.org/10.1145/3009977.3010005","url":null,"abstract":"Bishnupur is an attractive tourist place in West Bengal, India and is known for its terracotta temples. The place is one of the prospective candidates to be included in the list of UNESCO World Heritage sites. We intend to preserve this heritage site digitally and also to present some virtual interaction for tourists and researchers. In this paper, we present an image dataset of different temples (namely, Jor Bangla, Kalachand, Madan Mohan, Radha Madhav, Rasmancha, Shyamrai and Nandalal) in Bishnupur for evaluating different types of computer vision and image processing algorithms (like 3D reconstruction, image inpainting, texture classification and content specific image retrieval). The dataset is captured using four different cameras with different parameter settings. Some datasets are extracted and earmarked for certain applications such as texture classification, image inpainting and content specific image retrieval. Example results of baseline methods are also shown for these applications. Thus we evaluate the usefulness of this dataset. To the best of our knowledge, this is probably the first attempt at a combined dataset for evaluating various types of problems for a heritage site in India. The dataset is publicly available at http://www.isical.ac.in/~bsnpr/ for research purposes only.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"151 1","pages":"80:1-80:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74089731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classification of Schizophrenia versus normal subjects using deep learning","authors":"Pinkal Patel, P. Aggarwal, Anubha Gupta","doi":"10.1145/3009977.3010050","DOIUrl":"https://doi.org/10.1145/3009977.3010050","url":null,"abstract":"Motivated by deep learning approaches to classify normal and neuro-diseased subjects in functional Magnetic Resonance Imaging (fMRI), we propose a stacked autoencoder (SAE) based 2-stage architecture for disease diagnosis. In the proposed architecture, a separate 4-hidden layer autoencoder is trained in an unsupervised manner for feature extraction corresponding to every brain region. Thereafter, these trained autoencoders are used to provide features on class-labeled input data for training a binary support vector machine (SVM) based classifier. In order to design a robust classifier, noisy or inactive gray matter voxels are filtered out using a proposed covariance based approach. We applied the proposed methodology on a public dataset, namely, the 1000 Functional Connectomes Project Cobre dataset consisting of fMRI data of normal and Schizophrenia subjects. The proposed architecture is able to classify normal and Schizophrenia subjects with a 10-fold cross-validation accuracy of 92%, which is better than that of the existing methods used on the same dataset.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"3 1","pages":"28:1-28:6"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81908806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Alternate formulation for transform learning","authors":"Jyoti Maggu, A. Majumdar","doi":"10.1145/3009977.3010069","DOIUrl":"https://doi.org/10.1145/3009977.3010069","url":null,"abstract":"Dictionary learning has been used to solve inverse problems in imaging and as an unsupervised feature extraction tool in vision. The main disadvantage of dictionary learning for applications in vision is the relatively long feature extraction time during testing, owing to the requirement of solving an iterative optimization problem (l0-minimization). The newly developed analysis framework of transform learning does not suffer from this shortcoming; feature extraction only requires a matrix vector multiplication. This work proposes an alternate formulation for transform learning that improves the accuracy even further. Experiments on benchmark databases show that our proposed transform learning yields results better than dictionary learning, autoencoder (AE) and restricted Boltzmann machine (RBM). The feature extraction time is as fast as that of AE and RBM.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"36 1","pages":"50:1-50:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85191663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatio-temporal weighted histogram based mean shift for illumination robust target tracking","authors":"K. Deopujari, R. Velmurugan, K. Tiwari","doi":"10.1145/3009977.3010059","DOIUrl":"https://doi.org/10.1145/3009977.3010059","url":null,"abstract":"This paper proposes a simple method to handle illumination variation in a video. The proposed method is based on a generative mean shift tracker, which uses the energy compaction property of the discrete cosine transform (DCT) to handle illumination variation within and across frames. The proposed method uses a spatial and temporal DCT coefficient based approach to assign weights to target and candidate histograms in mean shift. The proposed weighting factor takes care of changes in illumination within a frame, i.e., illumination change of the target with respect to the background, and also across frames, i.e., varying illumination between consecutive time instances. The algorithm was tested using the VOT2015 challenge dataset and also on sequences from the OTB and CAVIAR datasets. The proposed method was also tested rigorously for the illumination attribute. The qualitative and quantitative evaluation process of the proposed method was twofold. First, the tracker was compared with an existing DCT coefficient based method and showed improved results. Secondly, the proposed algorithm was compared with other state-of-the-art trackers. The results show that the proposed algorithm outperformed some state-of-the-art trackers while with others it showed comparable performance.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"5 1","pages":"40:1-40:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91093053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reduction of variance of observations on pelvic structures in CBCT images using novel mean-shift and mutual information based image registration?","authors":"S. Malladi, Bijju Kranthi Veduruparthi, J. Mukherjee, P. Das, S. Chakrabarti, I. Mallick","doi":"10.1145/3009977.3010030","DOIUrl":"https://doi.org/10.1145/3009977.3010030","url":null,"abstract":"In this paper, Cone-Beam Computed Tomography (CBCT) image data of colorectal cancer patients are considered for registering standard reference locations of bony structures in the pelvic region. A solution is provided in this paper to automatically compute and resolve irregularities involved in locating bony structures in the pelvic region. A new algorithm is proposed to automatically locate the lowest 3D coordinates of the Pubic Symphysis (pb) and the Coccyx on a daily basis. The irregularities involved are reduced to a minimum by registering CBCT images. The conventional three dimensional mutual information (MI) based registration and a novel mean shift based mutual information technique are compared. The variations in the position of the pelvic region are also compared for unregistered and registered CBCT images. The proposed algorithm, tested on CBCT image data of 25 patients, each taken over a span of 27 days consecutively, provides promising results. The variations in the locations of coccyx, pb, and the distance between them were found to be reduced due to registration of 3D CBCT images.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"16 2 1","pages":"84:1-84:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89937482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bangla online handwriting recognition using recurrent neural network architecture","authors":"Bappaditya Chakraborty, P. Mukherjee, U. Bhattacharya","doi":"10.1145/3009977.3010072","DOIUrl":"https://doi.org/10.1145/3009977.3010072","url":null,"abstract":"Recognition of unconstrained handwritten texts is always a difficult problem, particularly if the style of handwriting is a mixed cursive one. Among various Indian scripts, only Bangla has this additional difficulty of tackling mixed cursiveness of its handwriting style in the pipeline of a method towards its automatic recognition. A few other common recognition difficulties of handwriting in an Indian script include the large size of its alphabet and the extremely cursive nature of the shapes of its alphabetic characters. These are among the reasons for achieving only limited success in the study of unconstrained handwritten Bangla text recognition. Artificial Neural Network (ANN) models have often been used for solving difficult real-life pattern recognition problems. Recurrent Neural Network (RNN) models have been studied in the literature for modeling sequence data. In this study, we consider the Long Short-Term Memory (LSTM) network model, a useful member of this family. In fact, Bidirectional Long Short-Term Memory (BLSTM) neural networks are a special kind of RNN and have recently attracted special attention in solving sequence labelling problems. In this article, we present a BLSTM architecture based approach for unconstrained online handwritten Bangla text recognition.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"39 1","pages":"63:1-63:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87736153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}