ET-PointPillars: improved PointPillars for 3D object detection based on optimized voxel downsampling

IF 2.3 · CAS Zone 4 (Computer Science) · JCR Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Vision and Applications · Pub Date: 2024-04-21 · DOI: 10.1007/s00138-024-01538-y

Yiyi Liu, Zhengyi Yang, JianLin Tong, Jiajia Yang, Jiongcheng Peng, Lihang Zhang, Wangxin Cheng


Citations: 0

Abstract


Preprocessing point cloud data is a long-standing problem in 3D object detection. Because point clouds are large, voxelization is often used to represent them at reduced data density. However, common voxelization selects sampling points from each voxel at random, which, in the presence of noise, often fails to represent local spatial features well. To preserve local features, this paper proposes an optimized voxel downsampling (OVD) method based on evidence theory. The method uses fuzzy sets to model a basic probability assignment (BPA) for each candidate point, incorporating point location information, and then applies evidence theory to fuse the BPAs and determine the selected sampling points. In the PointPillars 3D object detection algorithm, the point cloud is partitioned into pillars and encoded using each pillar’s points, and convolutional neural networks perform feature extraction and detection. As a second contribution, the paper proposes an improved PointPillars based on evidence theory (ET-PointPillars), which introduces an OVD-based feature point sampling module into the pillar feature network of PointPillars. The module selects feature points in each pillar using the optimized method, computes offsets to these points, and adds them as features so that the network can learn more object characteristics. Experiments on the KITTI dataset validate the method’s ability to preserve local spatial features. The results show improved detection precision, with an average increase of 2.73% for pedestrians and cyclists on KITTI.
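The abstract does not give the membership functions, the frame of discernment, or the evidence sources used by OVD, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a two-hypothesis frame {keep, drop} with an explicit ignorance mass, a Gaussian fuzzy membership, and two hypothetical evidence sources per point (distance to the voxel centroid and distance to the voxel's geometric center). The BPAs are fused with Dempster's rule of combination, and the point with the highest fused "keep" mass is selected. The names `fuzzy_bpa`, `dempster_combine`, `ovd_downsample`, and the parameters `alpha` and `sigma` are illustrative assumptions.

```python
import numpy as np

def fuzzy_bpa(dist, sigma, alpha=0.9):
    """Hypothetical BPA [m(keep), m(drop), m(ignorance)] from a Gaussian
    fuzzy membership of a distance; alpha < 1 reserves mass for ignorance."""
    mu = np.exp(-dist ** 2 / (2.0 * sigma ** 2))
    return np.array([alpha * mu, alpha * (1.0 - mu), 1.0 - alpha])

def dempster_combine(m1, m2):
    """Dempster's rule for two BPAs over {keep, drop} plus the full frame."""
    k1, d1, t1 = m1
    k2, d2, t2 = m2
    conflict = k1 * d2 + d1 * k2      # mass assigned to contradictory pairs
    norm = 1.0 - conflict             # > 0 whenever alpha < 1
    keep = (k1 * k2 + k1 * t2 + t1 * k2) / norm
    drop = (d1 * d2 + d1 * t2 + t1 * d2) / norm
    theta = (t1 * t2) / norm
    return np.array([keep, drop, theta])

def ovd_downsample(points, voxel_size=1.0):
    """Pick one representative point per voxel by fused 'keep' belief
    instead of picking one at random."""
    points = np.asarray(points, dtype=float)
    idx = np.floor(points[:, :3] / voxel_size).astype(int)

    # Group point indices by integer voxel coordinate.
    voxels = {}
    for i, key in enumerate(map(tuple, idx)):
        voxels.setdefault(key, []).append(i)

    sigma = 0.5 * voxel_size          # assumed width of the membership function
    selected = []
    for key, members in voxels.items():
        pts = points[members, :3]
        centroid = pts.mean(axis=0)                       # evidence source 1
        center = (np.array(key) + 0.5) * voxel_size       # evidence source 2
        scores = []
        for p in pts:
            m_a = fuzzy_bpa(np.linalg.norm(p - centroid), sigma)
            m_b = fuzzy_bpa(np.linalg.norm(p - center), sigma)
            scores.append(dempster_combine(m_a, m_b)[0])  # fused m(keep)
        selected.append(members[int(np.argmax(scores))])
    return points[selected]

if __name__ == "__main__":
    cloud = np.array([
        [0.50, 0.50, 0.50],
        [0.52, 0.48, 0.50],
        [0.48, 0.50, 0.52],
        [0.95, 0.95, 0.95],   # isolated point far from the local cluster
        [2.30, 0.10, 0.10],   # lone point in a different voxel
    ])
    print(ovd_downsample(cloud, voxel_size=1.0))
```

In this toy cloud the isolated point near the voxel corner receives low membership from both sources, so it is not chosen as the voxel's representative, whereas random selection would pick it a quarter of the time. In a pillar-based detector, per-point offsets to the selected point could then be appended as extra input features, which is the role the abstract describes for the sampling module.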


Source journal

Machine Vision and Applications (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore

6.30

Self-citation rate

3.00%

Publication volume

Review time

8.7 months

About the journal: Machine Vision and Applications publishes high-quality technical contributions in machine vision research and development. Specifically, the editors encourage submissions on all applications and engineering aspects of image-related computing. In particular, original contributions dealing with scientific, commercial, industrial, military, and biomedical applications of machine vision are all within the scope of the journal. Particular emphasis is placed on engineering and technology aspects of image processing and computer vision. The following aspects of machine vision applications are of interest: algorithms, architectures, VLSI implementations, AI techniques and expert systems for machine vision, front-end sensing, multidimensional and multisensor machine vision, real-time techniques, image databases, virtual reality and visualization. Papers must include a significant experimental validation component.