RGBD Semantic Segmentation Based on Global Convolutional Network
Xiaoning Gao, Jun Yu, Jian-xun Li
Proceedings of the 2019 4th International Conference on Robotics, Control and Automation, 2019-07-26
DOI: 10.1145/3351180.3351182 (https://doi.org/10.1145/3351180.3351182)
Citations: 3
Abstract
Convolutional neural networks (CNNs) have come to dominate image semantic segmentation and achieve good results on 2D image segmentation tasks. However, CNN-based 2D semantic segmentation algorithms remain unsatisfactory in many complex scenarios, such as indoor scenes. Fortunately, advances in depth sensor technology have made it easy to obtain depth information, which carries rich geometric structure. To embed the depth map effectively into a convolutional neural network, this paper adopts a dual-encoder fusion network framework that fully exploits these geometric features. To address the weakened local pixel classification ability of the dual-encoder fusion network, this paper introduces a global convolutional network (GCN), which is based on the large-kernel idea, to improve its performance. Extensive experiments on the NYU v2 dataset demonstrate that the dual-encoder fusion network with GCN achieves much better precision than the original fusion network and has stronger classification ability for local pixels.
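The abstract does not spell out the internals of the GCN block. As a rough illustration of the large-kernel idea, the sketch below assumes the layout usually associated with GCN: two separable branches (a k x 1 convolution followed by 1 x k, and 1 x k followed by k x 1) whose outputs are summed. The function names, the single-channel simplification, and the omission of biases are assumptions made for this example, not details taken from the paper.

```python
def conv2d_same(image, kernel):
    """Zero-padded 'same' 2D cross-correlation on nested lists (single channel)."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - ph, x + j - pw
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


def gcn_block(image, col_a, row_a, row_b, col_b):
    """Assumed GCN layout: (k x 1 then 1 x k) plus (1 x k then k x 1), summed."""
    branch_a = conv2d_same(conv2d_same(image, col_a), row_a)
    branch_b = conv2d_same(conv2d_same(image, row_b), col_b)
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(branch_a, branch_b)]


def gcn_params(k, c_in, c_out):
    """Weight count of the two separable branches (biases ignored)."""
    return 2 * (k * c_in * c_out + k * c_out * c_out)


def dense_params(k, c_in, c_out):
    """Weight count of a single dense k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


if __name__ == "__main__":
    k = 5
    col = [[1.0] for _ in range(k)]   # k x 1 kernel
    row = [[1.0] * k]                 # 1 x k kernel
    impulse = [[0.0] * 9 for _ in range(9)]
    impulse[4][4] = 1.0
    response = gcn_block(impulse, col, row, row, col)
    covered = sum(1 for r in response for v in r if v != 0.0)
    print("pixels reached by one impulse:", covered)
    # Hypothetical channel sizes, chosen only to compare weight counts.
    print("separable weights:", gcn_params(15, 2048, 21))
    print("dense weights:    ", dense_params(15, 2048, 21))
```

With all-ones kernels, a single input pixel influences a full k x k neighbourhood, while the weight count grows linearly in k rather than quadratically. That is the trade-off the large-kernel design relies on.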