Attention-guided LiDAR segmentation and odometry using image-to-point cloud saliency transfer

IF 3.5, Zone 3 (Computer Science), Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Multimedia Systems Pub Date : 2024-06-24 DOI:10.1007/s00530-024-01389-7

Guanqun Ding, Nevrez İmamoğlu, Ali Caglayan, Masahiro Murakawa, Ryosuke Nakamura

Citations: 0

Abstract

LiDAR odometry estimation and 3D semantic segmentation are crucial for autonomous driving, a field that has advanced remarkably in recent years. However, both tasks remain challenging: 3D semantic segmentation suffers from an imbalance of points across semantic categories, and LiDAR odometry estimation is affected by dynamic objects. This increases the importance of using representative, salient landmarks as reference points for robust feature learning. To address these challenges, we propose a saliency-guided approach that leverages attention information to improve the performance of LiDAR odometry estimation and semantic segmentation models. Unlike in the image domain, only a few studies have addressed point cloud saliency, owing to the lack of annotated training data. To alleviate this, we first present a universal framework to transfer saliency distribution knowledge from color images to point clouds, and use it to construct a pseudo-saliency dataset (i.e., FordSaliency) for point clouds. Then, we adopt point-cloud-based backbones to learn the saliency distribution from the pseudo-saliency labels, followed by our proposed SalLiDAR module. SalLiDAR is a saliency-guided 3D semantic segmentation model that integrates saliency information to improve segmentation performance. Finally, we introduce SalLONet, a self-supervised saliency-guided LiDAR odometry network that uses the semantic and saliency predictions of SalLiDAR to achieve better odometry estimation. Our extensive experiments on benchmark datasets demonstrate that the proposed SalLiDAR and SalLONet models achieve state-of-the-art performance against existing methods, highlighting the effectiveness of image-to-LiDAR saliency knowledge transfer. Source code will be available at https://github.com/nevrez/SalLONet
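The abstract does not spell out how saliency is transferred from images to point clouds, so the following is only a minimal sketch of one generic way such a transfer could work, not the authors' framework. It assumes a calibrated pinhole camera and projects each LiDAR point into an image saliency map, taking the value of the pixel it lands on as a pseudo-saliency label. The function name `transfer_saliency_to_points` and the parameters `K` (intrinsics) and `T_cam_lidar` (LiDAR-to-camera transform) are hypothetical names chosen for illustration.

```python
import numpy as np


def transfer_saliency_to_points(points, saliency, K, T_cam_lidar):
    """Give each LiDAR point the saliency of the pixel it projects onto.

    Illustrative assumptions (not taken from the paper):
      points       (N, 3) xyz coordinates in the LiDAR frame
      saliency     (H, W) image saliency map, values in [0, 1]
      K            (3, 3) pinhole camera intrinsics
      T_cam_lidar  (4, 4) rigid transform from LiDAR frame to camera frame
    Returns an (N,) array; points behind the camera or outside the image
    keep a label of 0. Occlusion is not handled in this sketch.
    """
    n = points.shape[0]
    # Move the points into the camera frame using homogeneous coordinates.
    homo = np.hstack([points, np.ones((n, 1))])
    cam = (T_cam_lidar @ homo.T).T[:, :3]

    labels = np.zeros(n, dtype=np.float64)

    # Only points with positive depth can be seen by the camera.
    in_front = cam[:, 2] > 1e-6
    uvw = (K @ cam[in_front].T).T

    # Perspective division, then truncate to integer pixel indices.
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(int)

    h, w = saliency.shape
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Map the surviving projections back to their original point indices.
    idx = np.flatnonzero(in_front)[inside]
    labels[idx] = saliency[v[inside], u[inside]]
    return labels
```

Labels produced this way could then serve as pseudo ground truth for training a point cloud backbone, in the spirit of the pseudo-saliency dataset the abstract describes. A more careful version would also need to handle occlusion and points outside the camera's field of view.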

Source journal

Multimedia Systems (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore: 5.40
Self-citation rate: 7.70%
Articles published: 148
Review time: 4.5 months

Journal description: This journal details innovative research ideas, emerging technologies, state-of-the-art methods and tools in all aspects of multimedia computing, communication, storage, and applications. It features theoretical, experimental, and survey articles.