RGBD Semantic Segmentation Based on Global Convolutional Network

Proceedings of the 2019 4th International Conference on Robotics, Control and Automation | Pub Date: 2019-07-26 | DOI: 10.1145/3351180.3351182

Xiaoning Gao, Jun Yu, Jian-xun Li

Citations: 3

Abstract

Convolutional neural networks (CNNs) have come to dominate image semantic segmentation and achieve good results on 2D segmentation tasks. However, CNN-based 2D semantic segmentation algorithms remain unsatisfactory in many complex scenarios, such as indoor scenes. Advances in depth sensor technology have made depth information easy to obtain, and this information carries rich geometric structure. To embed the depth map effectively into a convolutional neural network, this paper adopts a dual-encoder fusion network framework that exploits these geometric features. Because the dual-encoder fusion network has weakened local pixel classification ability, the paper introduces the global convolutional network (GCN), which is based on the large-kernel idea, to improve its performance. Extensive experiments on the NYU v2 dataset show that the dual-encoder fusion network with the global convolutional network achieves much higher precision than the original fusion network and has stronger local pixel classification ability.
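The abstract gives no implementation details, so the following is only a minimal single-channel sketch of the two ideas it names. It assumes the commonly described GCN design, in which a dense k x k kernel is approximated by two branches of separable convolutions (k x 1 then 1 x k, and 1 x k then k x 1) whose outputs are summed. It also assumes that RGB and depth encoder features are fused by element-wise addition, which the abstract does not confirm. All function names (`conv2d_same`, `gcn_block`, `fuse`, `param_counts`) are hypothetical, and the all-ones kernels stand in for learned weights.

```python
def conv2d_same(x, kernel):
    """Single-channel 2D cross-correlation with zero padding (same output size)."""
    h, w = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    r, c = i + a - ph, j + b - pw
                    if 0 <= r < h and 0 <= c < w:  # outside the map counts as zero
                        acc += x[r][c] * kernel[a][b]
            out[i][j] = acc
    return out


def add(x, y):
    """Element-wise sum of two equally sized feature maps."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def gcn_block(x, col_a, row_a, row_b, col_b):
    """Assumed GCN block: (k x 1 -> 1 x k) branch plus (1 x k -> k x 1) branch."""
    left = conv2d_same(conv2d_same(x, col_a), row_a)
    right = conv2d_same(conv2d_same(x, row_b), col_b)
    return add(left, right)


def fuse(rgb_feat, depth_feat):
    """Assumed fusion operator: add depth-encoder features to RGB-encoder features."""
    return add(rgb_feat, depth_feat)


def param_counts(k):
    """Weights per channel pair: dense k x k kernel vs. the four 1-D kernels."""
    return {"dense_kxk": k * k, "gcn_separable": 4 * k}


k = 7
col = [[1.0] for _ in range(k)]             # k x 1 kernel (placeholder weights)
row = [[1.0] * k]                           # 1 x k kernel (placeholder weights)
rgb_feat = [[1.0] * 9 for _ in range(9)]    # toy 9 x 9 RGB-branch feature map
depth_feat = [[0.5] * 9 for _ in range(9)]  # toy 9 x 9 depth-branch feature map

fused = fuse(rgb_feat, depth_feat)
out = gcn_block(fused, col, row, row, col)
counts = param_counts(k)
print(counts)
```

In this sketch each output pixel aggregates a full k x k neighbourhood while using 4k weights instead of k squared, which is the trade-off usually cited as motivation for the large-kernel idea.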
