{"title":"一种新的基于边界框的语义分割伪标注生成方法","authors":"Xiaolong Xu, Fanman Meng, Hongliang Li, Q. Wu, King Ngi Ngan, Shuai Chen","doi":"10.1109/VCIP49819.2020.9301833","DOIUrl":null,"url":null,"abstract":"This paper proposes a fusion-based method to generate pseudo-annotations from bounding boxes for semantic segmentation. The idea is to first generate diverse foreground masks by multiple bounding box segmentation methods, and then combine these masks to generate pseudo-annotations. Existing methods generate foreground masks from bounding boxes by classical segmentation methods driving by low-level features and own local information, which is hard to generate accurate and diverse results for the fusion. Different from the traditional methods, multiple class-agnostic models are modeled to learn the objectiveness cues by using existing labeled pixel-level annotations and then to fuse. Firstly, the classical Fully Convolutional Network (FCN) that densely predicts the pixels’ labels is used. Then, two new sparse prediction based class-agnostic models are proposed, which simplify the segmentation task as sparsely predicting the boundary points through predicting the distance from the bounding box border to the object boundary in Cartesian Coordinate System and the Polar Coordinate System, respectively. Finally, a voting-based strategy is proposed to combine these segmentation results to form better pseudo-annotations. We conduct experiments on PASCAL VOC 2012 dataset. The mIoU of the proposed method is 68.7%, which outperforms the state-of-the-art method by 1.9%.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"A New Bounding Box based Pseudo Annotation Generation Method for Semantic Segmentation\",\"authors\":\"Xiaolong Xu, Fanman Meng, Hongliang Li, Q. Wu, King Ngi Ngan, Shuai Chen\",\"doi\":\"10.1109/VCIP49819.2020.9301833\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper proposes a fusion-based method to generate pseudo-annotations from bounding boxes for semantic segmentation. The idea is to first generate diverse foreground masks by multiple bounding box segmentation methods, and then combine these masks to generate pseudo-annotations. Existing methods generate foreground masks from bounding boxes by classical segmentation methods driving by low-level features and own local information, which is hard to generate accurate and diverse results for the fusion. Different from the traditional methods, multiple class-agnostic models are modeled to learn the objectiveness cues by using existing labeled pixel-level annotations and then to fuse. Firstly, the classical Fully Convolutional Network (FCN) that densely predicts the pixels’ labels is used. Then, two new sparse prediction based class-agnostic models are proposed, which simplify the segmentation task as sparsely predicting the boundary points through predicting the distance from the bounding box border to the object boundary in Cartesian Coordinate System and the Polar Coordinate System, respectively. Finally, a voting-based strategy is proposed to combine these segmentation results to form better pseudo-annotations. We conduct experiments on PASCAL VOC 2012 dataset. The mIoU of the proposed method is 68.7%, which outperforms the state-of-the-art method by 1.9%.\",\"PeriodicalId\":431880,\"journal\":{\"name\":\"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)\",\"volume\":\"49 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/VCIP49819.2020.9301833\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/VCIP49819.2020.9301833","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A New Bounding Box based Pseudo Annotation Generation Method for Semantic Segmentation
This paper proposes a fusion-based method to generate pseudo-annotations from bounding boxes for semantic segmentation. The idea is to first generate diverse foreground masks by multiple bounding box segmentation methods, and then combine these masks to generate pseudo-annotations. Existing methods generate foreground masks from bounding boxes by classical segmentation methods driving by low-level features and own local information, which is hard to generate accurate and diverse results for the fusion. Different from the traditional methods, multiple class-agnostic models are modeled to learn the objectiveness cues by using existing labeled pixel-level annotations and then to fuse. Firstly, the classical Fully Convolutional Network (FCN) that densely predicts the pixels’ labels is used. Then, two new sparse prediction based class-agnostic models are proposed, which simplify the segmentation task as sparsely predicting the boundary points through predicting the distance from the bounding box border to the object boundary in Cartesian Coordinate System and the Polar Coordinate System, respectively. Finally, a voting-based strategy is proposed to combine these segmentation results to form better pseudo-annotations. We conduct experiments on PASCAL VOC 2012 dataset. The mIoU of the proposed method is 68.7%, which outperforms the state-of-the-art method by 1.9%.