Jingkuan Song, Lianli Gao, M. Puscas, F. Nie, Fumin Shen, N. Sebe
Joint Graph Learning and Video Segmentation via Multiple Cues and Topology Calibration
Proceedings of the 24th ACM International Conference on Multimedia, October 2016
DOI: 10.1145/2964284.2964295
Citations: 24
Abstract
Video segmentation has become an important and active research area with a large diversity of proposed approaches. Graph-based methods, which achieve top performance on recent benchmarks, usually focus on either obtaining a precise similarity graph or designing efficient graph-cutting strategies. However, these two components are often handled in two separate steps, so the obtained similarity graph may not be optimal for segmentation, which can lead to suboptimal results. In this paper, we propose a novel framework, joint graph learning and video segmentation (JGLVS), which learns the similarity graph and the video segmentation simultaneously. JGLVS learns the similarity graph by assigning adaptive neighbors to each vertex based on multiple cues (appearance, motion, boundary and spatial information). Meanwhile, a new rank constraint is imposed on the Laplacian matrix of the similarity graph, so that the number of connected components in the resulting similarity graph is exactly equal to the number of segments. Furthermore, JGLVS can automatically weight the multiple cues and calibrate the pairwise distances between superpixels based on their topology structures. Most notably, empirical results on the challenging VSB100 dataset show that JGLVS outperforms the state of the art by up to 11% on the BPR metric.
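The rank constraint mentioned above rests on a standard spectral-graph-theory fact: the multiplicity of the eigenvalue 0 of a graph Laplacian equals the number of connected components, so constraining the Laplacian's rank fixes the number of segments. The sketch below illustrates only this property on a tiny hypothetical similarity graph (the graph, its size, and the threshold are illustrative assumptions, not the paper's actual construction):

```python
import numpy as np

# Hypothetical similarity graph over 6 superpixels, built as two
# disjoint triangles -- i.e. two connected components.
W = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    W[i, j] = W[j, i] = 1.0

# Unnormalized graph Laplacian L = D - W, with D the degree matrix.
D = np.diag(W.sum(axis=1))
L = D - W

# The multiplicity of eigenvalue 0 of L equals the number of connected
# components -- the property the rank constraint in JGLVS exploits.
eigvals = np.linalg.eigvalsh(L)
num_components = int(np.sum(np.abs(eigvals) < 1e-9))
print(num_components)  # 2
```

In JGLVS this property is enforced during learning rather than checked afterwards: by constraining rank(L) = n - k while the graph is optimized, the learned similarity graph decomposes into exactly k components, which directly yield the segmentation.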