Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing — Latest Publications

Hierarchical structured learning for indoor autonomous navigation of Quadcopter
Pub Date: 2016-12-18 · DOI: 10.1145/3009977.3009990
Vishakh Duggal, K. Bipin, Utsav Shah, K. Krishna
Abstract: Autonomous navigation of a generic monocular quadcopter in indoor environments requires sophisticated approaches to perception, planning, and control. This paper presents a system that enables a miniature quadcopter with a frontal monocular camera to autonomously navigate and explore unknown indoor environments. The system first estimates a dense depth map of the environment from a single video frame using our proposed supervised Hierarchical Structured Learning (HSL) technique, which yields both high accuracy and better generalization. HSL discretizes the overall depth range into multiple sets and structures them hierarchically: the set of classes is recursively partitioned into two subsets, each representing a portion of the parent set's depth range, forming a binary tree. A binary classifier, implemented with a Support Vector Machine (SVM), is applied separately at each internal node of the tree. Depth estimation for each pixel then proceeds top-down from the root, classifying repeatedly until it reaches a leaf node representing the estimated depth. The generated depth map is provided as input to a Convolutional Neural Network (CNN), which generates flight-planning commands. Finally, a trajectory planning and control module employs a convex programming technique to generate a collision-free, minimum-time trajectory that follows these commands and produces appropriate control inputs for the quadcopter. The results convey unequivocally the advantages of depth perception by HSL, while repeatable successful flights in typical indoor corridors confirm the efficacy of the pipeline.
Pages: 13:1–13:8 · Citations: 3
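The top-down tree traversal described in the abstract can be sketched as follows. The tree layout and the threshold predicate standing in for the per-node SVM are illustrative assumptions, not the authors' implementation:

```python
# Hierarchical depth classification sketch: the depth range is split recursively,
# each internal node holds a binary classifier (an SVM in the paper; a simple
# threshold stand-in here), and a pixel descends from the root until it reaches
# a leaf representing a discrete depth bin.

def make_tree(lo, hi):
    """Build a binary tree over depth bins [lo, hi)."""
    if hi - lo == 1:
        return {"leaf": lo}
    mid = (lo + hi) // 2
    return {"split": mid, "left": make_tree(lo, mid), "right": make_tree(mid, hi)}

def classify(node, feature, predict):
    """Descend top-down; `predict` plays the role of the per-node SVM."""
    while "leaf" not in node:
        node = node["left"] if predict(feature, node["split"]) else node["right"]
    return node["leaf"]

# Toy stand-in: the feature is the true depth bin itself, so the
# "classifier" is just a comparison against the node's split point.
tree = make_tree(0, 8)
depth = classify(tree, 5, lambda f, split: f < split)  # depth == 5
```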
Automatic video matting through scribble propagation
Pub Date: 2016-12-18 · DOI: 10.1145/3009977.3009979
Bhoomika Sonane, S. Ramakrishnan, S. Raman
Abstract: Video matting is an extension of image matting and is used to extract the foreground matte from an arbitrary background in every frame of a video sequence. An automatic scribbling approach, based on the relative motion of the foreground object with respect to the background, is introduced for video matting. The proposed scribble propagation and subsequent isolation of foreground and background is much more intuitive than the conventional trimap-propagation approach. Alpha maps are propagated according to the optical flow estimated from consecutive frames to obtain a preliminary estimate of the foreground and background in the following frame. Accurate scribbles are placed near the boundary of the foreground region to refine the scribbled image with the help of morphological operations. We show that a high-quality matte of the foreground object can be obtained using a state-of-the-art image matting technique, and that the results of the proposed method are accurate and comparable with those of other state-of-the-art video matting techniques.
Pages: 87:1–87:8 · Citations: 4
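The core propagation step — warping scribbles to the next frame along the estimated optical flow — can be sketched as below. The mask/flow representation is an assumption for illustration; the paper additionally refines the result with morphological operations:

```python
import numpy as np

def propagate_scribbles(mask, flow):
    """Warp a binary scribble mask to the next frame using per-pixel flow.

    mask: (H, W) bool array of scribbled pixels.
    flow: (H, W, 2) integer displacements (dy, dx) per pixel.
    """
    h, w = mask.shape
    out = np.zeros_like(mask)
    ys, xs = np.nonzero(mask)
    # Move each scribbled pixel by its flow vector, clamped to the frame.
    ny = np.clip(ys + flow[ys, xs, 0], 0, h - 1)
    nx = np.clip(xs + flow[ys, xs, 1], 0, w - 1)
    out[ny, nx] = True
    return out
```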
Towards semantic visual representation: augmenting image representation with natural language descriptors
Pub Date: 2016-12-18 · DOI: 10.1145/3009977.3010010
Konda Reddy Mopuri, R. Venkatesh Babu
Abstract: Learning image representations is an interesting and challenging problem. When users upload images to photo-sharing websites, they often provide multiple textual tags for ease of reference. These tags can reveal significant information about the content of the image, such as the objects present or the action taking place. Approaches have been proposed to extract additional information from these tags to augment visual cues and build a multi-modal image representation. However, existing approaches pay little attention to the semantic meaning of the tags while encoding them. In this work, we enrich the image representation with tag encodings that leverage their semantics. Our approach uses neural-network-based natural language descriptors to represent the tag information. By complementing the visual features learned by convnets, it yields an efficient multi-modal image representation. Experimental evaluation suggests that exploiting the two data modalities produces a better multi-modal representation for classification on benchmark datasets.
Pages: 64:1–64:8 · Citations: 2
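The fusion step can be illustrated minimally: combine a convnet image feature with a pooled encoding of the tags' word vectors. Mean pooling and concatenation are assumptions here; the paper uses neural language descriptors for the tag side:

```python
import numpy as np

def multimodal_repr(visual_feat, tag_vectors):
    """Concatenate a convnet image feature with a pooled tag encoding.

    visual_feat: (Dv,) image feature vector.
    tag_vectors: (T, Dt) word vectors for the image's tags; mean-pooled
    here as a stand-in for the paper's neural language descriptors.
    """
    tag_feat = np.asarray(tag_vectors, dtype=float).mean(axis=0)
    return np.concatenate([np.asarray(visual_feat, dtype=float), tag_feat])
```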
Data-driven 2D effects animation
Pub Date: 2016-12-18 · DOI: 10.1145/3009977.3010000
Divya Grover, P. Chaudhuri
Abstract: Making plausible, high-quality visual effects, like water splashes or fire, in traditional 2D animation pipelines requires an animator to draw many frames of phenomena that are very difficult to recreate manually. We present a technique that uses a database of video clips of such phenomena to assist the animator, who only has to input sample sketched frames of the phenomena at particular time instants. These are matched to frames of the video clips, and a plausible sequence of frames is generated from the clips with the animator-drawn frames as constraints. The colour style of the hand-drawn frames is used to render the generated frames, resulting in a 2D animation that follows the style and intent of the 2D animator. Our system can also create multi-layered effects animation, allowing the animator to draw interacting mixed phenomena, like water being poured on fire.
Pages: 37:1–37:7 · Citations: 1
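The keyframe-to-clip matching step could be realized as a nearest-neighbour search over per-frame feature vectors. The feature representation and L2 metric are illustrative assumptions; the paper's matching and sequence-generation machinery is more involved:

```python
import numpy as np

def match_keyframes(sketch_feats, clip_feats):
    """Match each sketched keyframe to its nearest video-clip frame.

    sketch_feats: (S, D) features of the animator's sketched frames.
    clip_feats:   (C, D) features of the database clip frames.
    Returns the index of the closest clip frame (L2 distance) per sketch.
    """
    d = np.linalg.norm(sketch_feats[:, None, :] - clip_feats[None, :, :], axis=2)
    return d.argmin(axis=1)
```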
Image defencing via signal demixing
Pub Date: 2016-12-18 · DOI: 10.1145/3009977.3009984
Veepin Kumar, J. Mukherjee, S. Mandal
Abstract: We present a novel algorithm to remove near-regular, fence- or wire-like foreground patterns from an image. Fence detection and removal algorithms developed so far have poor performance in detecting the fence. We use signal demixing to exploit the sparsity and regularity of fences in detecting them. Results demonstrate the effectiveness of our technique compared with other state-of-the-art techniques.
Pages: 11:1–11:8 · Citations: 8
A biologically inspired saliency model for color fundus images
Pub Date: 2016-12-18 · DOI: 10.1145/3009977.3010041
Samrudhdhi B. Rangrej, J. Sivaswamy
Abstract: Saliency computation is widely studied in computer vision but not in medical imaging. Existing computational saliency models have been developed for general (natural) images and hence may not be suitable for medical images, owing to the variety of imaging modalities and the requirement that models capture not only normal anatomy but also deviations from it. We present a biologically inspired saliency model for colour fundus images and illustrate it for the case of diabetic retinopathy. The proposed model uses spatially-varying morphological operations to enhance lesions locally and combines an ensemble of the results of such operations to generate the saliency map. The model is validated against an average human gaze map from 15 experts and found to have 10% higher recall (at 100% precision) than four leading saliency models proposed for natural images. The F-score for the match with manual lesion markings by 5 experts was 0.4 for our model (versus 0.532 for the gaze map) and very poor for existing models. The model's utility is shown via a novel enhancement method that employs saliency to selectively enhance abnormal regions, boosting their contrast-to-noise ratio by ∼30%.
Pages: 54:1–54:8 · Citations: 0
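The morphological lesion enhancement at the heart of the model can be illustrated with a plain white top-hat transform, which highlights bright structures smaller than the structuring element. A fixed-size square element is an assumption here; the paper uses spatially-varying operations and an ensemble of results:

```python
import numpy as np

def white_tophat(img, k=3):
    """White top-hat: img minus its morphological opening (erosion then
    dilation with a k x k square element). Bright blobs smaller than the
    element survive; smooth background is suppressed."""
    pad = k // 2

    def filt(a, fn):
        # Sliding k x k window via padded shifted views; fn is min (erosion)
        # or max (dilation) over the window.
        p = np.pad(a, pad, mode="edge")
        stack = [p[i:i + a.shape[0], j:j + a.shape[1]]
                 for i in range(k) for j in range(k)]
        return fn(np.stack(stack), axis=0)

    opened = filt(filt(img, np.min), np.max)
    return img - opened
```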
Deep neural networks for segmentation of basal ganglia sub-structures in brain MR images
Pub Date: 2016-12-18 · DOI: 10.1145/3009977.3010048
Akshay Sethi, Akshat Sinha, Ayush Agarwal, Chetan Arora, Anubha Gupta
Abstract: Automated segmentation of brain structures in magnetic resonance imaging (MRI) scans is an important first step in the diagnosis of many neurological diseases. In this paper, we focus on segmenting the constituent sub-structures of the basal ganglia (BG), the region of the brain responsible for controlling movement and routine learning. Low-contrast voxels and undefined boundaries across sub-regions of the BG pose a challenge for automated segmentation. We pose segmentation as a voxel classification problem and propose a Deep Neural Network (DNN) classifier for BG segmentation. The DNN learns distinct regional features for voxel-wise classification of the BG area into four sub-regions: Caudate, Putamen, Pallidum, and Accumbens. We use a public dataset comprising 83 T1-weighted structural MRI scans of uniform dimensions from healthy and diseased (bipolar with and without psychosis, schizophrenia) subjects. To build a robust classifier, the proposed model has been trained on a mixed collection of healthy and diseased MRIs. We report an accuracy above 94% (calculated using the Dice coefficient) for all four classes on the healthy and diseased dataset.
Pages: 20:1–20:7 · Citations: 0
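The Dice coefficient used above as the accuracy measure has a standard definition, 2|P ∩ G| / (|P| + |G|) for predicted and ground-truth masks; a minimal implementation:

```python
import numpy as np

def dice_coefficient(pred, gt):
    """Dice overlap between predicted and ground-truth binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return 2.0 * np.logical_and(pred, gt).sum() / denom
```

In a multi-class setting like the four BG sub-regions, the score is computed per class by binarizing the label map against each class in turn.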
Recognizing facial expressions using novel motion based features
Pub Date: 2016-12-18 · DOI: 10.1145/3009977.3010004
Snehasis Mukherjee, B. Vamshi, K. V. Sai Vineeth Kumar Reddy, Repala Vamshi Krishna, S. V. S. Harish
Abstract: This paper introduces two novel motion-based features for recognizing human facial expressions from a video sequence. The proposed bag-of-words scheme represents each frame of a video sequence as a vector depicting local motion patterns during a facial expression. The local motion patterns are captured by an efficient derivation from optical flow. Motion features are clustered and stored as words in a dictionary. We further generate a reduced dictionary by ranking the words based on an ambiguity measure, pruning out the ambiguous words and continuing with the key words. The ambiguity measure is obtained by applying a graph-based technique, where each word is represented as a node in the graph and ambiguity is modelled from the frequency of occurrence of the word during the expression. We form expression descriptors for each expression from the reduced dictionary by applying an efficient kernel, and the descriptors are trained following an adaptive learning technique. We tested the proposed approach on a standard dataset; it shows better accuracy compared to the state-of-the-art.
Pages: 32:1–32:8 · Citations: 4
Autoregressive hidden Markov model with missing data for modelling functional MR imaging data
Pub Date: 2016-12-18 · DOI: 10.1145/3009977.3010021
Shilpa Dang, S. Chaudhury, Brejesh Lall, P. Roy
Abstract: Functional magnetic resonance imaging (fMRI) has opened ways to look inside the active human brain. However, the fMRI signal is an indirect indicator of underlying neuronal activity and has low temporal resolution due to the acquisition process. This paper proposes an autoregressive hidden Markov model with missing data (AR-HMM-md), a framework that addresses these issues while accurately capturing the characteristics of fMRI time series. The proposed work models unobserved neuronal activity over time as a sequence of discrete hidden states, and shows how exact inference can be obtained with missing fMRI data under the "Missing not at Random" (MNAR) mechanism, which requires explicit modelling of the missing data along with the observed data. Performance is evaluated through the convergence characteristics of the log-likelihoods and the classification capability of the proposed model against existing models on two fMRI datasets. One classification task distinguishes real fMRI time series from a task-based experiment from randomly generated time series; another distinguishes children from elderly subjects using resting-state fMRI time series. The proposed model captured the fMRI characteristics efficiently and thus converged to a better posterior probability, resulting in higher classification accuracy than existing models on both datasets.
Pages: 93:1–93:8 · Citations: 5
Intrinsic image decomposition using focal stacks
Pub Date: 2016-12-18 · DOI: 10.1145/3009977.3010046
Saurabh Saini, P. Sakurikar, P J Narayanan
Abstract: In this paper, we present a novel method (RGBF-IID) for intrinsic image decomposition of a wild scene without any restrictions on the complexity, illumination, or scale of the image. We use focal stacks of the scene as input. A focal stack captures a scene at varying focal distances; since focus depends on distance to the object, this representation carries information beyond an RGB image, toward an RGBD image with depth. We call our representation an RGBF image to highlight this. We use a robust focus measure and a generalized random walk algorithm to compute dense probability maps across the stack. These maps are used to define sparse local and global pixel neighbourhoods that adhere to the structure of the underlying 3D scene. We use these neighbourhood correspondences, together with standard chromaticity assumptions, as constraints in an optimization system. We present results on both indoor and outdoor scenes using manually captured stacks of random objects under natural as well as artificial lighting conditions. We also test our system on a larger dataset of synthetically generated focal stacks from the NYUv2 and MPI Sintel datasets and show competitive performance against current state-of-the-art IID methods that use RGBD images. Our method provides strong evidence for the potential of the RGBF modality in place of RGBD in computer vision.
Pages: 88:1–88:8 · Citations: 6
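The depth-like signal in a focal stack comes from per-pixel focus: the frame in which a pixel is sharpest indicates its distance. A minimal sketch using local gradient energy as the focus measure (the paper uses a more robust measure plus a generalized random walk; this shows only the core idea):

```python
import numpy as np

def focus_index(stack):
    """Per-pixel index of the sharpest frame in a focal stack.

    stack: list of (H, W) grayscale frames at increasing focal distance.
    Focus is measured as squared gradient magnitude at each pixel.
    """
    def energy(img):
        gy, gx = np.gradient(img.astype(float))
        return gy ** 2 + gx ** 2

    return np.argmax(np.stack([energy(f) for f in stack]), axis=0)
```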