Pap smear image classification using convolutional neural network
K. Bora, M. Chowdhury, L. Mahanta, M. Kundu, A. Das. Proceedings of the Indian Conference on Computer Vision, Graphics & Image Processing, pp. 55:1-55:8, December 2016. DOI: https://doi.org/10.1145/3009977.3010068
Abstract: This article presents the results of a comprehensive study of deep-learning-based computer-aided diagnostic techniques for classifying cervical dysplasia in Pap smear images. All experiments are performed on a real indigenous image database of 1611 images, generated at two diagnostic centres. The focus is on constructing an effective feature vector that captures multiple levels of representation of the features hidden in a Pap smear image. For this purpose a deep convolutional neural network is used, followed by unsupervised feature selection with the Maximal Information Compression Index as the similarity measure. Finally, the performance of two classifiers, Least Squares Support Vector Machine (LSSVM) and softmax regression, is monitored, and classifier selection is performed on the basis of five measures with five-fold cross-validation. The output classes reflect the established Bethesda system of classification for identifying pre-cancerous and cancerous lesions of the cervix. The proposed system is compared with two existing conventional systems and is also tested on a publicly available database. Experimental results and comparisons show that the proposed system performs Pap smear classification efficiently.
{"title":"Improving person re-identification systems: a novel score fusion framework for rank-n recognition","authors":"Arko Barman, S. Shah","doi":"10.1145/3009977.3010018","DOIUrl":"https://doi.org/10.1145/3009977.3010018","url":null,"abstract":"Person re-identification is an essential technique for video surveillance applications. Most existing algorithms for person re-identification deal with feature extraction, metric learning or a combination of both. Combining successful state-of-the-art methods using score fusion from the perspective of person re-identification has not yet been widely explored. In this paper, we endeavor to boost the performance of existing systems by combining them using a novel score fusion framework which requires no training or dataset-dependent tuning of parameters. We develop a robust and efficient method called Unsupervised Posterior Probability-based Score Fusion (UPPSF) for combination of raw scores obtained from multiple existing person re-identification algorithms in order to achieve superior recognition rates. We propose two novel generalized linear models for estimating the posterior probabilities of a given probe image matching each of the gallery images. Normalization and combination of these posterior probability values computed from each of the existing algorithms individually, yields a set of unified scores, which is then used for ranking the gallery images. Our score fusion framework is inherently capable of dealing with different ranges and distributions of matching scores emanating from existing algorithms, without requiring any prior knowledge about the algorithms themselves, effectively treating them as \"black-box\" methods. Experiments on widely-used challenging datasets like VIPeR, CUHK01, CUHK03, ETHZ1 and ETHZ2 demonstrate the efficiency of UPPSF in combining multiple algorithms at the score level.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"239 1","pages":"4:1-4:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74985051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ghosting free HDR for dynamic scenes via shift-maps","authors":"K. Prabhakar, R. Venkatesh Babu","doi":"10.1145/3009977.3010034","DOIUrl":"https://doi.org/10.1145/3009977.3010034","url":null,"abstract":"Given a set of sequential exposures, High Dynamic Range imaging is a popular method for obtaining high-quality images for fairly static scenes. However, this typically suffers from ghosting artifacts for scenes with significant motion. Also, existing techniques cannot handle heavily saturated regions in the sequence. In this paper, we propose an approach that handles both the issues mentioned above. We achieve robustness to motion (both object and camera) and saturation via an energy minimization formulation with spatio-temporal constraints. The proposed approach leverages information from the neighborhood of heavily saturated regions to correct such regions. The experimental results demonstrate the superiority of our method over state-of-the-art techniques for a variety of challenging dynamic scenes.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"19 1","pages":"57:1-57:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75306389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video stabilization by procrustes analysis of trajectories","authors":"Geethu Miriam Jacob, Sukhendu Das","doi":"10.1145/3009977.3009989","DOIUrl":"https://doi.org/10.1145/3009977.3009989","url":null,"abstract":"Video Stabilization algorithms are often necessary at the pre-processing stage for many applications in video analytics. The major challenges in video stabilization are the presence of jittery motion paths of a camera, large foreground moving objects with arbitrary motion and occlusions. In this paper, a simple, yet powerful video stabilization algorithm is proposed, by eliminating the trajectories with higher dynamism appearing due to jitter. A block-wise stabilization of the camera motion is performed, by analyzing the trajectories in Kendall's shape space. A 3-stage iterative process is proposed for each block of frames. At the first stage of the iterative process, the trajectories with relatively higher dynamism (estimated using optical flow) are eliminated. At the second stage, a Procrustes alignment is performed on the remaining trajectories and Frechet mean of the aligned trajectories is estimated. Finally, the Frechet mean is stabilized and a transformation of the stabilized Frechet mean to the original space (of the trajectories) yields the stabilized trajectories. A global optimization function has been designed for stabilization, thus minimizing wobbles and distortions in the frames. As the motion paths of the higher and lower dynamic regions become more distinct after stabilization, this iterative process helps in the identification of the stabilized background trajectories (with lower dynamism), which are used to warp the frames for rendering the stabilized frames. Experiments are done with varying levels of jitter introduced on stable videos, apart from a few benchmarked natural jittery videos. In cases, where synthetic jitter is fused on stable videos, an error norm comparing the groundtruth scores (scores of the stable videos) to the scores of the stabilized videos, is used for comparative study of performance. The results show the superiority of our proposed method over other state-of-the-art methods.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"68 1","pages":"47:1-47:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74305912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Blind image quality assessment using subspace alignment","authors":"I. Kiran, T. Guha, Gaurav Pandey","doi":"10.1145/3009977.3010014","DOIUrl":"https://doi.org/10.1145/3009977.3010014","url":null,"abstract":"This paper addresses the problem of estimating the quality of an image as it would be perceived by a human. A well accepted approach to assess perceptual quality of an image is to quantify its loss of structural information. We propose a blind image quality assessment method that aims at quantifying structural information loss in a given (possibly distorted) image by comparing its structures with those extracted from a database of clean images. We first construct a subspace from the clean natural images using (i) principal component analysis (PCA), and (ii) overcomplete dictionary learning with sparsity constraint. While PCA provides mathematical convenience, an overcomplete dictionary is known to capture the perceptually important structures resembling the simple cells in the primary visual cortex. The subspace learned from the clean images is called the source subspace. Similarly, a subspace, called the target subspace, is learned from the distorted image. In order to quantify the structural information loss, we use a subspace alignment technique which transforms the target subspace into the source by optimizing over a transformation matrix. This transformation matrix is subsequently used to measure the global and local (patch-based) quality score of the distorted image. The quality scores obtained by the proposed method are shown to correlate well with the subjective scores obtained from human annotators. Our method achieves competitive results when evaluated on three benchmark databases.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"8 1","pages":"91:1-91:6"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86145755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive artistic stylization of images","authors":"Ameya Deshpande, S. Raman","doi":"10.1145/3009977.3009985","DOIUrl":"https://doi.org/10.1145/3009977.3009985","url":null,"abstract":"In this work, we present a novel non-photorealistic rendering method which produces good quality stylization results for color images. The procedure is driven by saliency measure in the foreground and the background region. We start with generating saliency map and simple thresholding based segmentation to get rough estimation of the foreground-background mask. We improve this mask by using a scribble-based method where the scribbles for foreground-background regions are automatically generated from the previous rough estimation. Followed by the mask generation, we proceed with an iterative abstraction process which involves edge-preserving blurring and edge detection. The number of iterations of the abstraction process to be performed in the foreground and background regions are decided by tracking the changes in saliency measure in the foreground and the background regions. Performing unequal number of iterations helps to improve the average saliency measure in more salient region (foreground) while decreasing the average saliency measure in the non-salient region (background). Implementation results of our method shows the merits of this approach with other competing methods.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"68 1","pages":"3:1-3:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72807842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A large scale dataset for classification of vehicles in urban traffic scenes","authors":"H. S. Bharadwaj, S. Biswas, K. Ramakrishnan","doi":"10.1145/3009977.3010040","DOIUrl":"https://doi.org/10.1145/3009977.3010040","url":null,"abstract":"Vehicle Classification has been a well-researched topic in the recent past. However, advances in the field have not been corroborated with deployment in Intelligent Traffic Management, due to non-availability of surveillance quality visual data of vehicles in urban traffic junctions. In this paper, we present a dataset aimed at exploring Vehicle Classification and related problems in dense, urban traffic scenarios. We present our on-going effort of collecting a large scale, surveillance quality, dataset of vehicles seen mostly on Indian roads. The dataset is an extensive collection of vehicles under different poses, scales and illumination conditions in addition to a smaller set of Near Infrared spectrum images for night time and low light traffic surveillance. We will make the dataset available for further research in this area. We propose and evaluate few baseline algorithms for the task of vehicle classification on this dataset. We also discuss challenges and potential applications of the data.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"16 1","pages":"83:1-83:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78611748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Depth estimation from single image using machine learning techniques","authors":"Nidhi Chahal, Meghna Pippal, S. Chaudhury","doi":"10.1145/3009977.3010019","DOIUrl":"https://doi.org/10.1145/3009977.3010019","url":null,"abstract":"In this paper, the problem of depth estimation from single monocular image is considered. The depth cues such as motion, stereo correspondences are not present in single image which makes the task more challenging. We propose a machine learning based approach for extracting depth information from single image. The deep learning is used for extracting features, then, initial depths are generated using manifold learning in which neighborhood preserving embedding algorithm is used. Then, fixed point supervised learning is applied for sequential labeling to obtain more consistent and accurate depth maps. The features used are initial depths obtained from manifold learning and various image based features including texture, color and edges which provide useful information about depth. A fixed point contraction mapping function is generated using which depth map is predicted for new structured input image. The transfer learning approach is also used for improvement in learning in a new task through the transfer of knowledge from a related task that has already been learned. The predicted depth maps are reliable, accurate and very close to ground truth depths which is validated using objective measures: RMSE, PSNR, SSIM and subjective measure: MOS score.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"87 1","pages":"19:1-19:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76825275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"COMA-boost: co-operative multi agent AdaBoost","authors":"A. Lahiri, Biswajit Paria, P. Biswas","doi":"10.1145/3009977.3009997","DOIUrl":"https://doi.org/10.1145/3009977.3009997","url":null,"abstract":"Multi feature space representation is a common practise in computer vision applications. Traditional features such as HOG, SIFT, SURF etc., individually encapsulates certain discriminative cues for visual classification. On the other hand, each layer of a deep neural network generates multi ordered representations. In this paper we present a novel approach for such multi feature representation learning using Adaptive Boosting (AdaBoost). General practise in AdaBoost [8] is to concatenate components of feature spaces and train base learners to classify examples as correctly/incorrectly classified. We posit that multi feature space learning should be viewed as a derivative of cooperative multi agent learning. To this end, we propose a mathematical framework to leverage performance of base learners over each feature space, gauge a measure of \"difficulty\" of training space and finally make soft weight updates rather than strict binary weight updates prevalent in regular AdaBoost. This is made possible by periodically sharing of response states by our learner agents in the boosting framework. Theoretically, such soft weight update policy allows infinite combinations of weight updates on training space compared to only two possibilities in AdaBoost. This opens up the opportunity to identify 'more difficult' examples compared to 'less difficult' examples. We test our model on traditional multi feature representation of MNIST handwritten character dataset and 100-Leaves classification challenge. We consistently outperform traditional and variants of multi view boosting in terms of accuracy while margin analysis reveals that proposed method fosters formation of more confident ensemble of learner agents. As an application of using our model in conjecture with deep neural network, we test our model on the challenging task of retinal blood vessel segmentation from fundus images of DRIVE dataset by using kernel dictionaries from layers of unsupervised trained stacked autoencoder network. Our work opens a new avenue of research for combining a popular statistical machine learning paradigm with deep network architectures.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"4 1","pages":"43:1-43:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79847842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Document blur detection using edge profile mining
S. Maheshwari, P. Rai, Gopal Sharma, Vineet Gandhi. Proceedings of the Indian Conference on Computer Vision, Graphics & Image Processing, pp. 23:1-23:7, December 2016. DOI: https://doi.org/10.1145/3009977.3009982
Abstract: We present an algorithm for automatic blur detection in document images, using a novel approach based on edge intensity profiles. Our main insight is that edge profiles are a strong indicator of the blur present in an image, with steep profiles implying sharper regions and gradual profiles implying blurred regions. Our approach first retrieves the profile for each point of intensity transition (each edge point) along the gradient, and then uses the profiles to output a quantitative measure indicating the extent of blur in the input image. The real-time performance of the proposed approach makes it suitable for most applications. Additionally, our method works for both handwritten and digital documents and is agnostic to font types and sizes, which gives it a major advantage over the currently prevalent learning-based approaches. Extensive quantitative and qualitative experiments over two different datasets show that our method outperforms almost all algorithms in the current state of the art by a significant margin, especially in cross-dataset experiments.