{"title":"Automated brain tractography segmentation using curvature points","authors":"Vedang Patel, Anand Parmar, A. Bhavsar, A. Nigam","doi":"10.1145/3009977.3010013","DOIUrl":"https://doi.org/10.1145/3009977.3010013","url":null,"abstract":"Classification of brain fiber tracts is an important problem in brain tractography analysis. We propose a supervised algorithm which learns features for anatomically meaningful fiber clusters, from labeled DTI white matter data. The classification is performed at two levels: a) Grey vs White matter (macro level) and b) White matter clusters (micro level). Our approach focuses on high curvature points in the fiber tracts, which embodies the unique characteristics of the respective classes. Any test fiber is classified into one of these learned classes by comparing proximity using the learned curvature-point model (for micro level) and with a neural network classifier (at macro level). The proposed algorithm has been validated with brain DTI data for three subjects containing about 2,50,000 fibers per subject, and is shown to yield high classification accuracy (> 93%) at both macro and micro levels.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"25 1","pages":"18:1-18:6"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76641517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical structured learning for indoor autonomous navigation of Quadcopter","authors":"Vishakh Duggal, K. Bipin, Utsav Shah, K. Krishna","doi":"10.1145/3009977.3009990","DOIUrl":"https://doi.org/10.1145/3009977.3009990","url":null,"abstract":"Autonomous navigation of generic monocular quadcopter in the indoor environment requires sophisticated approaches for perception, planning and control. This paper presents a system which enables a miniature quadcopter with a frontal monocular camera to autonomously navigate and explore the unknown indoor environment. Initially, the system estimates dense depth map of the environment from a single video frame using our proposed novel supervised Hierarchical Structured Learning (hsl) technique, which yields both high accuracy levels and better generalization. The proposed hsl approach discretizes the overall depth range into multiple sets. It structures these sets hierarchically and recursively through partitioning the set of classes into two subsets with subsets representing apportioned depth range of the parent set, forming a binary tree. The binary classification method is applied to each internal node of binary tree separately using Support Vector Machine (svm). Whereas, the depth estimation of each pixel of the image starts from the root node in top-down approach, classifying repetitively till it reaches any of the leaf node representing its estimated depth. The generated depth map is provided as an input to Convolutional Neural Network (cnn), which generates flight planning commands. Finally, trajectory planning and control module employs a convex programming technique to generate collision-free minimum time trajectory which follows these flight planning commands and produces appropriate control inputs for the quadcopter. The results convey unequivocally the advantages of depth perception by hsl, while repeatable flights of successful nature in typical indoor corridors confirm the efficacy of the pipeline.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"1 1","pages":"13:1-13:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76462407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep image inpainting with region prediction at hierarchical scales","authors":"Souradeep Chakraborty, Jogendra Nath Kundu, R. Venkatesh Babu","doi":"10.1145/3009977.3009992","DOIUrl":"https://doi.org/10.1145/3009977.3009992","url":null,"abstract":"In this paper, we propose a CNN based method for image inpainting, which utilizes the inpaintings generated at different hierarchical resolutions. Firstly, we begin with the prediction of the missing image region with larger contextual information at the lowest resolution using deconv layers. Next, we refine the predicted region at greater hierarchical scales by imposing gradually reduced contextual information surrounding the predicted region by training different CNNs. Thus, our method not only utilizes information from different hierarchical resolutions but also intelligently leverages the context information at different hierarchy to produce better inpainted image. The individual models are trained jointly, using loss functions placed at intermediate layers. Finally, the CNN generated image region is sharpened using the unsharp masking operation, followed by intensity matching with the contextual region, to produce visually consistent and appealing inpaintings with more prominent edges. Comparison of our method with well-known inpainting methods, on the Caltech 101 objects dataset, demonstrates the quantitative and qualitative strengths of our method over the others.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"60 1","pages":"33:1-33:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81366487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Limitations with measuring performance of techniques for abnormality localization in surveillance video and how to overcome them?","authors":"M. Sharma, S. Sarcar, D. Sheet, P. Biswas","doi":"10.1145/3009977.3010044","DOIUrl":"https://doi.org/10.1145/3009977.3010044","url":null,"abstract":"Now a days video surveillance is becoming more popular due to global security concerns and with the increasing need for effective monitoring of public places. The key goal of video surveillance is to detect suspicious or abnormal behavior. Various efforts have been made to detect an abnormality in the video. Further to these advancements, there is a need for better techniques for evaluation of abnormality localization in video surveillance. Existing technique mainly uses forty percent overlap rule with ground-truth data, and does not considers the extra predicted region into the computation. Existing metrics have been found to be inaccurate when more than one region is present within the frame which may or may not be correctly localized or marked as abnormal. This work attempts to bridge these limitations in existing metrics. In this paper, we investigate three existing metrics and discuss their benefits and limitations for evaluating localization of abnormality in video. We further extend the existing work by introducing penalty functions and substantiate the validity of proposed metrics with a sufficient number of instances. The presented metric are validated on data (35 different situations) for which the overlap has been computed analytically.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"3 1","pages":"75:1-75:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82196394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reinforced random forest","authors":"Angshuman Paul, D. Mukherjee","doi":"10.1145/3009977.3010003","DOIUrl":"https://doi.org/10.1145/3009977.3010003","url":null,"abstract":"Reinforcement learning improves classification accuracy. But use of reinforcement learning is relatively unexplored in case of random forest classifier. We propose a reinforced random forest (RRF) classifier that exploits reinforcement learning to improve classification accuracy. Our algorithm is initialized with a forest. Then the entire training data is tested using the initial forest. In order to reinforce learning, we use mis-classified data points to grow certain number of new trees. A subset of the new trees is added to the existing forest using a novel graph-based approach. We show that addition of these trees ensures improvement in classification accuracy. This process is continued iteratively until classification accuracy saturates. The proposed RRF has low computational burden. We achieve at least 3% improvement in F-measure compared to random forest in three breast cancer datasets. Results on benchmark datasets show significant reduction in average classification error.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"56 1","pages":"1:1-1:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82268802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spectral decomposition and progressive reconstruction of scalar volumes","authors":"Uddipan Mukherjee","doi":"10.1145/3009977.3010017","DOIUrl":"https://doi.org/10.1145/3009977.3010017","url":null,"abstract":"Modern 3D imaging technologies often generate large scale volume datasets that may be represented as 3-way tensors. These volume datasets are usually compressed for compact storage, and interactive visual analysis of the data warrants efficient decompression techniques at real time. Using well known tensor decomposition techniques like CP or Tucker decomposition the volume data can be represented by a few basis vectors, the number of such vectors, called the rank of the tensor, determining the visual quality. However, in such methods, the basis vectors used between successive ranks are completely different, thereby requiring a complete recomputation of basis vectors whenever the visual quality needs to be altered. In this work, a new progressive decomposition technique is introduced for scalar volumes wherein new basis vectors are added to the already existing lower rank basis vectors. Large scale datasets are usually divided into bricks of smaller size and each such brick is represented in a compressed form. The bases used for the different bricks are data dependent and are completely different from one another. The decomposition method introduced here uses the same basis vectors for all the bricks at all hierarchical levels of detail. The basis vectors are data independent thereby minimizing storage and allowing fast data reconstruction.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"56 1","pages":"31:1-31:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82888717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Event recognition in egocentric videos using a novel trajectory based feature","authors":"Vinodh Buddubariki, Sunitha Gowd Tulluri, Snehasis Mukherjee","doi":"10.1145/3009977.3010011","DOIUrl":"https://doi.org/10.1145/3009977.3010011","url":null,"abstract":"This paper proposes an approach for event recognition in Egocentric videos using dense trajectories over Gradient Flow - Space Time Interest Point (GF-STIP) feature. We focus on recognizing events of diverse categories (including indoor and outdoor activities, sports and social activities and adventures) in egocentric videos. We introduce a dataset with diverse egocentric events, as all the existing egocentric activity recognition datasets consist of indoor videos only. The dataset introduced in this paper contains 102 videos with 9 different events (containing indoor and outdoor videos with varying lighting conditions). We extract Space Time Interest Points (STIP) from each frame of the video. The interest points are taken as the lead pixels and Gradient-Weighted Optical Flow (GWOF) features are calculated on the lead pixels by multiplying the optical flow measure and the magnitude of gradient at the pixel, to obtain the GF-STIP feature. We construct pose descriptors with the GF-STIP feature. We use the GF-STIP descriptors for recognizing events in egocentric videos with three different approaches: following a Bag of Words (BoW) model, implementing Fisher Vectors and obtaining dense trajectories for the videos. We show that the dense trajectory features based on the proposed GF-STIP descriptors enhance the efficacy of the event recognition system in egocentric videos.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"82 1","pages":"76:1-76:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83921090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low complexity encoder for feedback-channel-free distributed video coding using deep convolutional neural networks at the decoder","authors":"Pudi Raj Bhagath, J. Mukherjee, Sudipta Mukopadhayay","doi":"10.1145/3009977.3009986","DOIUrl":"https://doi.org/10.1145/3009977.3009986","url":null,"abstract":"We propose a very low complexity encoder for feedback-channel-free distributed video coding (DVC) applications using deep convolutional neural network (CNN) at the decoder side. Deep CNN on super resolution uses low resolution (LR) images with 25% pixels information of high resolution (HR) image to super resolve it by the factor 2. Instead we train the network with 50% of noisy Wyner-Ziv (WZ) pixels to get full original WZ frame. So at the decoder, deep CNN reconstructs the original WZ image from 50% noisy WZ pixels. These noisy samples are obtained from the iterative algorithm called DLRTex. At the encoder side we compute local rank transform (LRT) of WZ frames for alternate pixels instead of all to reduce bit rate and complexity. These local rank transformed values are merged and their rank positions in the WZ frame are entropy coded using MQ-coder. In addition, average intensity values of each block of WZ frame are also transmitted to assist motion estimation. At the decoder, side information (SI) is generated by implementing motion estimation and compensation in LRT domain. The DLRTex algorithm is executed on SI using LRT to get the 50% noisy WZ pixels which are used in reconstructing full WZ frame. We compare our results with pixel domain DVC approaches and show that the coding efficiency of our codec is better than pixel domain distributed video coders based on low-density parity check and accumulate (LDPCA) or turbo codes. We also derive the complexity of our encoder interms of number of operations and prove that its complexity is very less compared to the LDPCA based methods.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"15 4 1","pages":"44:1-44:7"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91263863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection and segmentation of mirror-like surfaces using structured illumination","authors":"R. Aggarwal, A. Namboodiri","doi":"10.1145/3009977.3010020","DOIUrl":"https://doi.org/10.1145/3009977.3010020","url":null,"abstract":"In computer vision, many active illumination techniques employ Projector-Camera systems to extract useful information from the scenes. Known illumination patterns are projected onto the scene and their deformations in the captured images are then analyzed. We observe that the local frequencies in the captured pattern for the mirror-like surfaces is different from the projected pattern. This property allows us to design a custom Projector-Camera system to segment mirror-like surfaces by analyzing the local frequencies in the captured images. The system projects a sinusoidal pattern and capture the images from projector's point of view. We present segmentation results for the scenes including multiple reflections and inter-reflections from the mirror-like surfaces. The method can further be used in the separation of direct and global components for the mirror-like surfaces by illuminating the non-mirror-like objects separately. We show how our method is also useful for accurate estimation of shape of the non-mirror-like regions in the presence of mirror-like regions in a scene.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"448 1","pages":"66:1-66:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86856550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient adaptive weighted minimization for compressed sensing magnetic resonance image reconstruction","authors":"S. Datta, B. Deka","doi":"10.1145/3009977.3009991","DOIUrl":"https://doi.org/10.1145/3009977.3009991","url":null,"abstract":"Compressed sensing magnetic resonance imaging (CSMRI) have demonstrated that it is possible to accelerate MRI scan time by reducing the number of measurements in the k-space without significant loss of anatomical details. The number of k-space measurements is roughly proportional to the sparsity of the MR signal under consideration. Recently, a few works on CSMRI have revealed that the sparsity of the MR signal can be enhanced by suitable weighting of different regularization priors. In this paper, we have proposed an efficient adaptive weighted reconstruction algorithm for the enhancement of sparsity of the MR image. Experimental results show that the proposed algorithm gives better reconstructions with less number of measurements without significant increase of the computational time compared to existing algorithms in this line.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"10 1","pages":"95:1-95:8"},"PeriodicalIF":0.0,"publicationDate":"2016-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91086977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}