NUC-Net: Non-Uniform Cylindrical Partition Network for Efficient LiDAR Semantic Segmentation

IF 11.1 · JCR Region 1 (Engineering & Technology) · Q1 ENGINEERING, ELECTRICAL & ELECTRONIC
Xuzhi Wang;Wei Feng;Lingdong Kong;Liang Wan
{"title":"NUC-Net:用于高效激光雷达语义分割的非均匀圆柱分割网络","authors":"Xuzhi Wang;Wei Feng;Lingdong Kong;Liang Wan","doi":"10.1109/TCSVT.2025.3554182","DOIUrl":null,"url":null,"abstract":"LiDAR semantic segmentation plays a vital role in autonomous driving. Existing voxel-based methods for LiDAR semantic segmentation apply uniform partition to the 3D LiDAR point cloud to form a structured representation based on cartesian/cylindrical coordinates. Although these methods show impressive performance, the drawback of existing voxel-based methods remains in two aspects: 1) it requires a large enough input voxel resolution, which brings a large amount of computation cost and memory consumption. 2) it does not well handle the unbalanced point distribution of LiDAR point cloud. In this paper, we propose a non-uniform cylindrical partition network named NUC-Net to tackle the above challenges. Specifically, we propose the Arithmetic Progression of Interval (API) method to non-uniformly partition the radial axis and generate the voxel representation which is representative and efficient. Moreover, we propose a non-uniform multi-scale aggregation method to improve contextual information. Our method achieves state-of-the-art performance on SemanticKITTI and nuScenes datasets with much faster speed and much less training time. And our method can be a general component for LiDAR semantic segmentation, which significantly improves both the accuracy and efficiency of the uniform counterpart by <inline-formula> <tex-math>$4 \\times $ </tex-math></inline-formula> training faster and <inline-formula> <tex-math>$2 \\times $ </tex-math></inline-formula> GPU memory reduction and <inline-formula> <tex-math>$3 \\times $ </tex-math></inline-formula> inference speedup. We further provide theoretical analysis towards understanding why NUC is effective and how point distribution affects performance. Code is available at <uri>https://github.com/alanWXZ/NUC-Net</uri>.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"35 9","pages":"9090-9104"},"PeriodicalIF":11.1000,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"NUC-Net: Non-Uniform Cylindrical Partition Network for Efficient LiDAR Semantic Segmentation\",\"authors\":\"Xuzhi Wang;Wei Feng;Lingdong Kong;Liang Wan\",\"doi\":\"10.1109/TCSVT.2025.3554182\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"LiDAR semantic segmentation plays a vital role in autonomous driving. Existing voxel-based methods for LiDAR semantic segmentation apply uniform partition to the 3D LiDAR point cloud to form a structured representation based on cartesian/cylindrical coordinates. Although these methods show impressive performance, the drawback of existing voxel-based methods remains in two aspects: 1) it requires a large enough input voxel resolution, which brings a large amount of computation cost and memory consumption. 2) it does not well handle the unbalanced point distribution of LiDAR point cloud. In this paper, we propose a non-uniform cylindrical partition network named NUC-Net to tackle the above challenges. Specifically, we propose the Arithmetic Progression of Interval (API) method to non-uniformly partition the radial axis and generate the voxel representation which is representative and efficient. Moreover, we propose a non-uniform multi-scale aggregation method to improve contextual information. 
Our method achieves state-of-the-art performance on SemanticKITTI and nuScenes datasets with much faster speed and much less training time. And our method can be a general component for LiDAR semantic segmentation, which significantly improves both the accuracy and efficiency of the uniform counterpart by <inline-formula> <tex-math>$4 \\\\times $ </tex-math></inline-formula> training faster and <inline-formula> <tex-math>$2 \\\\times $ </tex-math></inline-formula> GPU memory reduction and <inline-formula> <tex-math>$3 \\\\times $ </tex-math></inline-formula> inference speedup. We further provide theoretical analysis towards understanding why NUC is effective and how point distribution affects performance. Code is available at <uri>https://github.com/alanWXZ/NUC-Net</uri>.\",\"PeriodicalId\":13082,\"journal\":{\"name\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"volume\":\"35 9\",\"pages\":\"9090-9104\"},\"PeriodicalIF\":11.1000,\"publicationDate\":\"2025-03-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10938726/\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Circuits and Systems for Video Technology","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10938726/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

LiDAR semantic segmentation plays a vital role in autonomous driving. Existing voxel-based methods for LiDAR semantic segmentation apply a uniform partition to the 3D LiDAR point cloud to form a structured representation based on Cartesian/cylindrical coordinates. Although these methods show impressive performance, they have two drawbacks: 1) they require a sufficiently large input voxel resolution, which incurs a large computation cost and memory consumption; and 2) they do not handle the unbalanced point distribution of LiDAR point clouds well. In this paper, we propose a non-uniform cylindrical partition network named NUC-Net to tackle these challenges. Specifically, we propose the Arithmetic Progression of Interval (API) method to non-uniformly partition the radial axis and generate a voxel representation that is both representative and efficient. Moreover, we propose a non-uniform multi-scale aggregation method to improve contextual information. Our method achieves state-of-the-art performance on the SemanticKITTI and nuScenes datasets with much faster speed and much less training time. NUC-Net can also serve as a general component for LiDAR semantic segmentation, significantly improving both the accuracy and efficiency of its uniform counterpart, with $4\times$ faster training, $2\times$ lower GPU memory use, and $3\times$ faster inference. We further provide theoretical analysis of why NUC is effective and how point distribution affects performance. Code is available at https://github.com/alanWXZ/NUC-Net.
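To make the partitioning idea concrete, below is a minimal sketch of how a radial axis could be split into bins whose widths follow an arithmetic progression, so that bins are finer near the sensor (where points are dense) and coarser far away (where points are sparse). The function names (`api_radial_edges`, `cylindrical_voxel_index`), the way the common difference `d` is chosen, and the grid sizes are illustrative assumptions, not the exact formulation used by NUC-Net.

```python
import numpy as np

def api_radial_edges(r_min, r_max, n_bins, d):
    """Radial bin edges whose widths form an arithmetic progression.

    width_i = a + i * d for i = 0..n_bins-1, with the first width `a`
    solved so that the widths sum to (r_max - r_min). A positive common
    difference `d` gives narrow bins near the sensor (dense points) and
    wide bins far away (sparse points).
    Illustrative sketch only; NUC-Net's exact parameterization may differ.
    """
    total = r_max - r_min
    # n*a + d*n*(n-1)/2 = total  ->  solve for the first width a.
    a = (total - d * n_bins * (n_bins - 1) / 2.0) / n_bins
    assert a > 0, "common difference d is too large for this range/bin count"
    widths = a + d * np.arange(n_bins)
    return r_min + np.concatenate(([0.0], np.cumsum(widths)))


def cylindrical_voxel_index(points_xyz, r_edges, n_theta, n_z, z_min, z_max):
    """Map (x, y, z) points to (radial, angular, height) voxel indices."""
    x, y, z = points_xyz[:, 0], points_xyz[:, 1], points_xyz[:, 2]
    r = np.sqrt(x**2 + y**2)
    theta = np.arctan2(y, x)  # in [-pi, pi]

    # Non-uniform radial bin from the arithmetic-progression edges.
    r_idx = np.clip(np.searchsorted(r_edges, r, side="right") - 1, 0, len(r_edges) - 2)
    # Uniform angular and height bins, as in a standard cylindrical grid.
    t_idx = np.clip(((theta + np.pi) / (2 * np.pi) * n_theta).astype(int), 0, n_theta - 1)
    z_idx = np.clip(((z - z_min) / (z_max - z_min) * n_z).astype(int), 0, n_z - 1)
    return np.stack([r_idx, t_idx, z_idx], axis=1)


# Example with hypothetical parameters: 32 radial bins over 0-50 m.
r_edges = api_radial_edges(r_min=0.0, r_max=50.0, n_bins=32, d=0.08)
pts = np.column_stack([
    np.random.uniform(-50, 50, 1000),  # x
    np.random.uniform(-50, 50, 1000),  # y
    np.random.uniform(-4, 2, 1000),    # z
])
voxel_ids = cylindrical_voxel_index(pts, r_edges, n_theta=360, n_z=32, z_min=-4.0, z_max=2.0)
```

With a positive common difference, the intent is that each radial bin receives a more balanced share of points than a uniform grid would give, which is the intuition behind retaining near-range detail while using fewer voxels overall.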
Source journal: IEEE Transactions on Circuits and Systems for Video Technology
CiteScore: 13.80
Self-citation rate: 27.40%
Publication volume: 660
Review time: 5 months
Journal description: The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.