ET-PointPillars:基于优化体素下采样的改进型三维物体检测 PointPillars

IF 2.4 4区 计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Yiyi Liu, Zhengyi Yang, JianLin Tong, Jiajia Yang, Jiongcheng Peng, Lihang Zhang, Wangxin Cheng
{"title":"ET-PointPillars:基于优化体素下采样的改进型三维物体检测 PointPillars","authors":"Yiyi Liu, Zhengyi Yang, JianLin Tong, Jiajia Yang, Jiongcheng Peng, Lihang Zhang, Wangxin Cheng","doi":"10.1007/s00138-024-01538-y","DOIUrl":null,"url":null,"abstract":"<p>The preprocessing of point cloud data has always been an important problem in 3D object detection. Due to the large volume of point cloud data, voxelization methods are often used to represent the point cloud while reducing data density. However, common voxelization randomly selects sampling points from voxels, which often fails to represent local spatial features well due to noise. To preserve local features, this paper proposes an optimized voxel downsampling(OVD) method based on evidence theory. This method uses fuzzy sets to model basic probability assignments (BPAs) for each candidate point, incorporating point location information. It then employs evidence theory to fuse the BPAs and determine the selected sampling points. In the PointPillars 3D object detection algorithm, the point cloud is partitioned into pillars and encoded using each pillar’s points. Convolutional neural networks are used for feature extraction and detection. Another contribution is the proposed improved PointPillars based on evidence theory (ET-PointPillars) by introducing an OVD-based feature point sampling module in the PointPillars’ pillar feature network, which can select feature points in pillars using the optimized method, computes offsets to these points, and adds them as features to facilitate learning more object characteristics, improving traditional PointPillars. Experiments on the KITTI datasets validate the method’s ability to preserve local spatial features. Results showed improved detection precision, with a <span>\\(2.73\\%\\)</span> average increase for pedestrians and cyclists on KITTI.</p>","PeriodicalId":51116,"journal":{"name":"Machine Vision and Applications","volume":"101 1","pages":""},"PeriodicalIF":2.4000,"publicationDate":"2024-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"ET-PointPillars: improved PointPillars for 3D object detection based on optimized voxel downsampling\",\"authors\":\"Yiyi Liu, Zhengyi Yang, JianLin Tong, Jiajia Yang, Jiongcheng Peng, Lihang Zhang, Wangxin Cheng\",\"doi\":\"10.1007/s00138-024-01538-y\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>The preprocessing of point cloud data has always been an important problem in 3D object detection. Due to the large volume of point cloud data, voxelization methods are often used to represent the point cloud while reducing data density. However, common voxelization randomly selects sampling points from voxels, which often fails to represent local spatial features well due to noise. To preserve local features, this paper proposes an optimized voxel downsampling(OVD) method based on evidence theory. This method uses fuzzy sets to model basic probability assignments (BPAs) for each candidate point, incorporating point location information. It then employs evidence theory to fuse the BPAs and determine the selected sampling points. In the PointPillars 3D object detection algorithm, the point cloud is partitioned into pillars and encoded using each pillar’s points. Convolutional neural networks are used for feature extraction and detection. Another contribution is the proposed improved PointPillars based on evidence theory (ET-PointPillars) by introducing an OVD-based feature point sampling module in the PointPillars’ pillar feature network, which can select feature points in pillars using the optimized method, computes offsets to these points, and adds them as features to facilitate learning more object characteristics, improving traditional PointPillars. Experiments on the KITTI datasets validate the method’s ability to preserve local spatial features. Results showed improved detection precision, with a <span>\\\\(2.73\\\\%\\\\)</span> average increase for pedestrians and cyclists on KITTI.</p>\",\"PeriodicalId\":51116,\"journal\":{\"name\":\"Machine Vision and Applications\",\"volume\":\"101 1\",\"pages\":\"\"},\"PeriodicalIF\":2.4000,\"publicationDate\":\"2024-04-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Machine Vision and Applications\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s00138-024-01538-y\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Vision and Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s00138-024-01538-y","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

点云数据的预处理一直是三维物体检测中的一个重要问题。由于点云数据量大,通常采用体素化方法来表示点云,同时降低数据密度。然而,普通的体素化是从体素中随机选择采样点,由于噪声的影响,往往不能很好地表示局部空间特征。为了保留局部特征,本文提出了一种基于证据理论的优化体素下采样(OVD)方法。该方法使用模糊集为每个候选点的基本概率分布(BPA)建模,并结合了点的位置信息。然后,它利用证据理论来融合 BPA,并确定所选的采样点。在 PointPillars 3D 物体检测算法中,点云被分割成支柱,并使用每个支柱的点进行编码。卷积神经网络用于特征提取和检测。另一个贡献是提出了基于证据理论的改进型 PointPillars(ET-PointPillars),在 PointPillars 的支柱特征网络中引入了基于 OVD 的特征点采样模块,该模块可以使用优化方法选择支柱中的特征点,计算这些点的偏移量,并将其添加为特征,以便学习更多对象特征,从而改进了传统的 PointPillars。在 KITTI 数据集上的实验验证了该方法保留局部空间特征的能力。结果表明,KITTI 数据集上行人和骑自行车者的检测精度提高了,平均提高了(2.73%)。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

ET-PointPillars: improved PointPillars for 3D object detection based on optimized voxel downsampling

ET-PointPillars: improved PointPillars for 3D object detection based on optimized voxel downsampling

The preprocessing of point cloud data has always been an important problem in 3D object detection. Due to the large volume of point cloud data, voxelization methods are often used to represent the point cloud while reducing data density. However, common voxelization randomly selects sampling points from voxels, which often fails to represent local spatial features well due to noise. To preserve local features, this paper proposes an optimized voxel downsampling(OVD) method based on evidence theory. This method uses fuzzy sets to model basic probability assignments (BPAs) for each candidate point, incorporating point location information. It then employs evidence theory to fuse the BPAs and determine the selected sampling points. In the PointPillars 3D object detection algorithm, the point cloud is partitioned into pillars and encoded using each pillar’s points. Convolutional neural networks are used for feature extraction and detection. Another contribution is the proposed improved PointPillars based on evidence theory (ET-PointPillars) by introducing an OVD-based feature point sampling module in the PointPillars’ pillar feature network, which can select feature points in pillars using the optimized method, computes offsets to these points, and adds them as features to facilitate learning more object characteristics, improving traditional PointPillars. Experiments on the KITTI datasets validate the method’s ability to preserve local spatial features. Results showed improved detection precision, with a \(2.73\%\) average increase for pedestrians and cyclists on KITTI.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Machine Vision and Applications
Machine Vision and Applications 工程技术-工程:电子与电气
CiteScore
6.30
自引率
3.00%
发文量
84
审稿时长
8.7 months
期刊介绍: Machine Vision and Applications publishes high-quality technical contributions in machine vision research and development. Specifically, the editors encourage submittals in all applications and engineering aspects of image-related computing. In particular, original contributions dealing with scientific, commercial, industrial, military, and biomedical applications of machine vision, are all within the scope of the journal. Particular emphasis is placed on engineering and technology aspects of image processing and computer vision. The following aspects of machine vision applications are of interest: algorithms, architectures, VLSI implementations, AI techniques and expert systems for machine vision, front-end sensing, multidimensional and multisensor machine vision, real-time techniques, image databases, virtual reality and visualization. Papers must include a significant experimental validation component.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信