Decoupled Iterative Deep Sensor Fusion for 3D Semantic Segmentation

Fabian Duerr, H. Weigel, J. Beyerer
{"title":"Decoupled Iterative Deep Sensor Fusion for 3D Semantic Segmentation","authors":"Fabian Duerr, H. Weigel, J. Beyerer","doi":"10.1142/s1793351x21400067","DOIUrl":null,"url":null,"abstract":"One of the key tasks for autonomous vehicles or robots is a robust perception of their 3D environment, which is why autonomous vehicles or robots are equipped with a wide range of different sensors. Building upon a robust sensor setup, understanding and interpreting their 3D environment is the next important step. Semantic segmentation of 3D sensor data, e.g. point clouds, provides valuable information for this task and is often seen as key enabler for 3D scene understanding. This work presents an iterative deep fusion architecture for semantic segmentation of 3D point clouds, which builds upon a range image representation of the point clouds and additionally exploits camera features to increase accuracy and robustness. In contrast to other approaches, which fuse lidar and camera features once, the proposed fusion strategy iteratively combines and refines lidar and camera features at different scales inside the network architecture. Additionally, the proposed approach can deal with camera failure as well as jointly predict lidar and camera segmentation. We demonstrate the benefits of the presented iterative deep fusion approach on two challenging datasets, outperforming all range image-based lidar and fusion approaches. An in-depth evaluation underlines the effectiveness of the proposed fusion strategy and the potential of camera features for 3D semantic segmentation.","PeriodicalId":217956,"journal":{"name":"Int. J. Semantic Comput.","volume":"57 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Int. J. Semantic Comput.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1142/s1793351x21400067","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

One of the key tasks for autonomous vehicles and robots is robust perception of their 3D environment, which is why they are equipped with a wide range of different sensors. Building upon a robust sensor setup, the next important step is understanding and interpreting the 3D environment. Semantic segmentation of 3D sensor data, e.g. point clouds, provides valuable information for this task and is often seen as a key enabler for 3D scene understanding. This work presents an iterative deep fusion architecture for semantic segmentation of 3D point clouds, which builds upon a range image representation of the point clouds and additionally exploits camera features to increase accuracy and robustness. In contrast to other approaches, which fuse lidar and camera features only once, the proposed fusion strategy iteratively combines and refines lidar and camera features at different scales inside the network architecture. Additionally, the proposed approach can deal with camera failure and can jointly predict lidar and camera segmentation. We demonstrate the benefits of the presented iterative deep fusion approach on two challenging datasets, outperforming all range image-based lidar and fusion approaches. An in-depth evaluation underlines the effectiveness of the proposed fusion strategy and the potential of camera features for 3D semantic segmentation.
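The range image representation mentioned in the abstract maps each lidar point to a pixel via spherical projection of its azimuth and elevation angles. The following is a minimal sketch of such a projection, not the authors' code; the image size and vertical field-of-view bounds are assumptions chosen to resemble a typical 64-beam rotating lidar and may differ from the paper's setup.

```python
# Hedged sketch: spherical projection of a point cloud into a range image.
# Image size and field-of-view bounds are illustrative assumptions.
import numpy as np

def point_cloud_to_range_image(points, height=64, width=2048,
                               fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) point cloud into a (height, width) range image."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points[:, :3], axis=1)        # range of each point

    yaw = np.arctan2(y, x)                           # azimuth in [-pi, pi]
    pitch = np.arcsin(z / np.maximum(r, 1e-8))       # elevation angle

    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)
    fov = fov_up - fov_down

    # Normalize angles to pixel coordinates.
    u = 0.5 * (1.0 - yaw / np.pi) * width            # column from azimuth
    v = (fov_up - pitch) / fov * height              # row from elevation

    u = np.clip(np.floor(u), 0, width - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, height - 1).astype(np.int32)

    image = np.zeros((height, width), dtype=np.float32)
    # Write farther points first so the closest point per pixel wins.
    order = np.argsort(r)[::-1]
    image[v[order], u[order]] = r[order]
    return image
```

In practice, additional per-point channels (e.g. intensity or the x, y, z coordinates themselves) are often stacked alongside the range channel before feeding the image into a 2D segmentation network.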
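The iterative fusion strategy repeatedly combines and refines lidar and camera features at several scales inside the network. The sketch below shows one plausible realization of a single fusion step, assuming camera features have already been warped into the range-image view; the module name FusionBlock, the channel sizes, and the residual-style update are illustrative assumptions, not the paper's exact design.

```python
# Hedged sketch of one lidar/camera feature fusion step at a single scale.
# FusionBlock, channel counts, and the residual update are assumptions.
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    """Fuses camera features into lidar features at one scale."""
    def __init__(self, lidar_ch, cam_ch):
        super().__init__()
        self.merge = nn.Sequential(
            nn.Conv2d(lidar_ch + cam_ch, lidar_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(lidar_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, lidar_feat, cam_feat):
        # Residual update keeps a lidar-only path, so the network can still
        # produce output if camera features are zeroed out (one simple way
        # to emulate the camera failure case the abstract mentions).
        fused = self.merge(torch.cat([lidar_feat, cam_feat], dim=1))
        return lidar_feat + fused

# Repeating the fusion across decoder scales, coarse to fine:
blocks = nn.ModuleList([FusionBlock(c, c) for c in (256, 128, 64)])

lidar = torch.randn(1, 256, 4, 128)   # coarse-scale lidar features
cam = torch.randn(1, 256, 4, 128)     # camera features in range-image view
out = blocks[0](lidar, cam)
print(out.shape)                      # torch.Size([1, 256, 4, 128])
```

The residual formulation is one common design choice for making a fusion branch optional; it is used here only to illustrate how iterated, per-scale refinement can coexist with robustness to a missing camera stream.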