A New Bounding Box based Pseudo Annotation Generation Method for Semantic Segmentation
Xiaolong Xu, Fanman Meng, Hongliang Li, Q. Wu, King Ngi Ngan, Shuai Chen
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP), December 2020
DOI: 10.1109/VCIP49819.2020.9301833
Citations: 4
Abstract
This paper proposes a fusion-based method to generate pseudo-annotations from bounding boxes for semantic segmentation. The idea is to first generate diverse foreground masks with multiple bounding box segmentation methods, and then combine these masks into pseudo-annotations. Existing methods generate foreground masks from bounding boxes with classical segmentation methods driven by low-level features and only local information, which makes it hard to produce results that are both accurate and diverse enough for fusion. Different from these traditional methods, multiple class-agnostic models are trained to learn objectness cues from existing pixel-level annotations, and their outputs are then fused. First, the classical Fully Convolutional Network (FCN), which densely predicts per-pixel labels, is used. Then, two new sparse-prediction-based class-agnostic models are proposed, which simplify the segmentation task to sparsely predicting boundary points by regressing the distance from the bounding box border to the object boundary in the Cartesian and Polar coordinate systems, respectively. Finally, a voting-based strategy combines these segmentation results into better pseudo-annotations. We conduct experiments on the PASCAL VOC 2012 dataset. The mIoU of the proposed method is 68.7%, outperforming the state-of-the-art method by 1.9%.
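To make the Cartesian sparse-prediction idea concrete, the sketch below decodes predicted border-to-boundary distances into a dense binary mask inside a bounding box. This is our own illustration, not the paper's implementation: the function name `cartesian_decode`, the per-row/per-column distance arrays, and the box geometry are all hypothetical.

```python
import numpy as np

def cartesian_decode(box_h, box_w, d_left, d_right, d_top, d_bottom):
    """Decode sparse Cartesian boundary predictions into a binary mask.

    For each row y, d_left[y] / d_right[y] give the distance (in pixels)
    from the left / right box border to the object boundary; d_top[x] and
    d_bottom[x] do the same per column. Pixels between the decoded
    boundaries are marked foreground. (Illustrative assumption of the
    parameterisation, not the paper's exact formulation.)
    """
    mask = np.ones((box_h, box_w), dtype=np.uint8)
    for y in range(box_h):
        mask[y, :d_left[y]] = 0            # background strip on the left
        mask[y, box_w - d_right[y]:] = 0   # background strip on the right
    for x in range(box_w):
        mask[:d_top[x], x] = 0             # background strip on top
        mask[box_h - d_bottom[x]:, x] = 0  # background strip on the bottom
    return mask
```

Predicting one distance per row/column is far sparser than labelling every pixel, which is the simplification the abstract refers to.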
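The final fusion step can be sketched as a per-pixel majority vote over the candidate foreground masks. Again, this is a minimal sketch under our own assumptions (`fuse_masks` and the equal-weight threshold rule are hypothetical; the paper's voting strategy may weight the models differently):

```python
import numpy as np

def fuse_masks(masks, threshold=0.5):
    """Per-pixel majority vote over candidate binary masks.

    A pixel becomes foreground in the pseudo-annotation when at least a
    `threshold` fraction of the candidate masks mark it as foreground.
    (Illustrative equal-weight voting rule.)
    """
    votes = np.stack(masks).astype(np.float32).mean(axis=0)
    return (votes >= threshold).astype(np.uint8)
```

Voting keeps pixels on which the diverse segmenters agree and suppresses the idiosyncratic errors of any single model, which is why diversity among the candidate masks matters for the fusion.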