Graph-to-Graph Energy Minimization for Video Object Segmentation
Authors: Yuezun Li, Longyin Wen, Ming-Ching Chang, Siwei Lyu
Published in: 2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 2019-09-01
DOI: https://doi.org/10.1109/AVSS.2019.8909894
We describe a new unsupervised video object segmentation (VOS) method based on graph-to-graph energy minimization, which exploits the mutual bootstrapping information between bottom-up (i.e., using pixel/superpixel attributes) and top-down (i.e., using learned appearance and motion cues) processes in a unified framework. Specifically, we construct a graph-to-graph energy function that encodes the spatial similarities among superpixels (superpixel-graph) and the temporal consistency among regions (region-graph). An efficient heuristic iterative algorithm minimizes the energy function to obtain the optimal assignment of superpixel and region labels and thereby complete the VOS task. Experiments on two challenging benchmarks (i.e., SegTrack v2 and DAVIS) show that the proposed method performs favorably against state-of-the-art unsupervised VOS methods and comparably to state-of-the-art semi-supervised methods.
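The abstract does not state the form of the energy function or the heuristic used to minimize it, so the following is only a toy sketch of the general idea: two coupled label sets (superpixels and regions), each with its own graph, minimized jointly by greedy label flips. Everything in it is an assumption made for illustration, including the binary labels, the Potts-style pairwise terms, the coupling weight `lam`, the `sp_to_rg` membership list, and the flip-based descent, which stands in for the paper's algorithm rather than reproducing it.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int, float]  # (node_a, node_b, weight); hypothetical format


def total_energy(sp_unary: Sequence[Sequence[float]],
                 rg_unary: Sequence[Sequence[float]],
                 sp_edges: Sequence[Edge],
                 rg_edges: Sequence[Edge],
                 sp_to_rg: Sequence[int],
                 sp_lab: Sequence[int],
                 rg_lab: Sequence[int],
                 lam: float) -> float:
    """Toy coupled energy over superpixel labels and region labels."""
    # Unary costs: how poorly each label fits each node.
    e = sum(sp_unary[i][sp_lab[i]] for i in range(len(sp_lab)))
    e += sum(rg_unary[k][rg_lab[k]] for k in range(len(rg_lab)))
    # Potts penalties: spatial edges between superpixels,
    # temporal edges between regions.
    e += sum(w for a, b, w in sp_edges if sp_lab[a] != sp_lab[b])
    e += sum(w for a, b, w in rg_edges if rg_lab[a] != rg_lab[b])
    # Coupling term: a superpixel should agree with the region containing it.
    e += lam * sum(1 for i, k in enumerate(sp_to_rg) if sp_lab[i] != rg_lab[k])
    return e


def minimize_energy(sp_unary: Sequence[Sequence[float]],
                    rg_unary: Sequence[Sequence[float]],
                    sp_edges: Sequence[Edge],
                    rg_edges: Sequence[Edge],
                    sp_to_rg: Sequence[int],
                    lam: float = 1.0,
                    max_iters: int = 50) -> Tuple[List[int], List[int], float]:
    """Greedy descent: flip one binary label at a time, keep strict improvements."""
    # Initialize every node with its cheapest unary label.
    sp_lab = [0 if u[0] <= u[1] else 1 for u in sp_unary]
    rg_lab = [0 if u[0] <= u[1] else 1 for u in rg_unary]

    def energy() -> float:
        return total_energy(sp_unary, rg_unary, sp_edges, rg_edges,
                            sp_to_rg, sp_lab, rg_lab, lam)

    current = energy()
    for _ in range(max_iters):
        changed = False
        # Sweep the superpixel labels first, then the region labels.
        for labels in (sp_lab, rg_lab):
            for n in range(len(labels)):
                labels[n] = 1 - labels[n]      # tentative flip
                candidate = energy()
                if candidate < current:
                    current = candidate        # keep the improving flip
                    changed = True
                else:
                    labels[n] = 1 - labels[n]  # revert
        if not changed:  # local minimum: no single flip lowers the energy
            break
    return sp_lab, rg_lab, current


# Tiny example: four superpixels grouped into two regions.
sp_labels, rg_labels, final_energy = minimize_energy(
    sp_unary=[[0.1, 0.9], [0.6, 0.4], [0.9, 0.1], [0.8, 0.2]],
    rg_unary=[[0.2, 0.8], [0.7, 0.3]],
    sp_edges=[(0, 1, 0.5), (2, 3, 0.5), (1, 2, 0.1)],
    rg_edges=[(0, 1, 0.1)],
    sp_to_rg=[0, 0, 1, 1],
    lam=0.5,
)
print(sp_labels, rg_labels, round(final_energy, 3))
```

Recomputing the full energy after every flip keeps the sketch short but is wasteful; a practical implementation would evaluate only the terms touching the flipped node. Greedy flips also reach only a local minimum, so the result depends on the initialization.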