Cascaded Feature Network for Semantic Segmentation of RGB-D Images

2017 IEEE International Conference on Computer Vision (ICCV). Pub Date: 2017-10-01. DOI: 10.1109/ICCV.2017.147

Di Lin, Guangyong Chen, D. Cohen-Or, P. Heng, Hui Huang

Citations: 107

Abstract

Fully convolutional networks (FCNs) have been successfully applied to the semantic segmentation of scenes represented with RGB images. Images augmented with a depth channel provide more understanding of the geometric information of the scene in the image. The question is how best to exploit this additional information to improve segmentation performance.

In this paper, we present a neural network with multiple branches for segmenting RGB-D images. Our approach is to use the available depth to split the image into layers with common visual characteristics of objects/scenes, or common "scene-resolution". We introduce the context-aware receptive field (CaRF), which provides better control over the relevant contextual information of the learned features. Equipped with CaRF, each branch of the network semantically segments regions of similar scene-resolution, leading to a more focused domain that is easier to learn. Furthermore, our network is cascaded, with features from one branch augmenting the features of the adjacent branch. We show that such cascading of features enriches the contextual information of each branch and enhances the overall performance. The accuracy that our network achieves outperforms the state-of-the-art methods on two public datasets.
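
The idea of using depth to split an image into layers can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the layer boundaries are chosen, so the sketch assumes equal-width depth bins. The function name `split_by_depth`, the parameter `num_layers`, and the sample depth values are all hypothetical.

```python
def split_by_depth(depth, num_layers=3):
    """Assign every pixel of a depth map to one of `num_layers` layers.

    Assumption: layers are equal-width intervals between the smallest and
    largest depth value. The abstract does not specify the actual rule.
    """
    flat = [d for row in depth for d in row]
    lo, hi = min(flat), max(flat)
    width = (hi - lo) / num_layers
    if width == 0:
        width = 1.0  # constant depth map: everything falls into layer 0

    def layer_of(d):
        # Clamp so that the maximum depth lands in the last layer.
        return min(int((d - lo) / width), num_layers - 1)

    return [[layer_of(d) for d in row] for row in depth]


def layer_masks(layer_index, num_layers=3):
    """Build one boolean mask per layer; together they partition the image."""
    return [[[idx == k for idx in row] for row in layer_index]
            for k in range(num_layers)]


# Hypothetical 3x3 depth map (e.g. in metres): near, middle and far rows.
depth = [[0.5, 1.0, 1.5],
         [2.5, 3.0, 3.5],
         [4.5, 5.0, 6.5]]
layer_index = split_by_depth(depth, num_layers=3)
masks = layer_masks(layer_index, num_layers=3)
print(layer_index)  # [[0, 0, 0], [1, 1, 1], [2, 2, 2]]
```

In a multi-branch design such as the one the abstract describes, each mask would select the pixels that one branch is responsible for. How the branches are built and how their features are cascaded is outside the scope of this sketch.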
