LVP: Leverage Virtual Points in Multimodal Early Fusion for 3-D Object Detection

IF 8.6 | Zone 1, Earth Sciences | Q1, Engineering, Electrical & Electronic

IEEE Transactions on Geoscience and Remote Sensing | Pub Date: 2024-12-17 | DOI: 10.1109/TGRS.2024.3519386

Yidong Chen, Guorong Cai, Ziying Song, Zhaoliang Liu, Binghui Zeng, Jonathan Li, Zongyue Wang

Volume 63, pages 1-15. Full text on IEEE Xplore: https://ieeexplore.ieee.org/document/10804692/

Citations: 0

Abstract

Point clouds are sparse and prone to occlusion, so detectors that rely on point clouds alone perform poorly on sparse or occluded objects. Researchers have therefore been actively exploring multimodal fusion to address this bottleneck of LiDAR-based detection. In particular, virtual points, generated through depth completion from the front-view RGB image, offer the potential for better integration with point clouds. However, recent approaches fuse the two modalities at the region-of-interest (RoI) stage, which limits fusion effectiveness because the RoIs produced by the point cloud branch are inaccurate, especially for hard samples. To overcome this and unleash the potential of virtual points, we present Leverage Virtual Points (LVP), a high-performance 3-D object detector that leverages virtual points in early fusion, in combination with late fusion, to enhance the quality of RoI generation. LVP consists of three early fusion modules, virtual points painting (VPP), virtual points auxiliary (VPA), and virtual points completion (VPC), which together achieve point-level and global-level fusion. Integrating these modules improves occlusion handling and the detection of distant small objects. On the KITTI benchmark, LVP achieves 85.45% 3-D mAP. On the larger nuScenes dataset, LVP can improve the detection accuracy of large objects by compensating for errors in depth estimation. Without bells and whistles, these results establish LVP as a strong solution for outdoor 3-D object detection.
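The abstract does not spell out how virtual points are produced or fused, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a dense depth map has already been produced by some depth-completion model, uses a plain pinhole camera model, and assumes the LiDAR points have already been transformed into the camera frame. The function names, the `max_depth` cutoff, and the source flag are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_virtual_points(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    max_depth: float = 80.0,  # hypothetical range cutoff in meters
) -> List[Point]:
    """Back-project a dense depth map into camera-frame 3-D points.

    depth[v][u] is the completed depth (meters) at pixel column u, row v.
    Pixels with non-positive depth or depth beyond max_depth are skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0 or z > max_depth:
                continue
            # Pinhole model: invert u = fx * x / z + cx and v = fy * y / z + cy.
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def merge_with_lidar(
    lidar_points: Sequence[Point],
    virtual_points: Sequence[Point],
) -> List[Tuple[float, float, float, float]]:
    """Concatenate real and virtual points, appending a source flag.

    The flag is 1.0 for LiDAR points and 0.0 for virtual points. Both inputs
    are assumed to be expressed in the same coordinate frame.
    """
    merged = [(x, y, z, 1.0) for x, y, z in lidar_points]
    merged += [(x, y, z, 0.0) for x, y, z in virtual_points]
    return merged
```

Merging before the detector sees the data is what makes this an early (point-level) fusion in the sense the abstract uses, as opposed to combining the two modalities only inside each RoI. The source flag lets a downstream network treat the noisier virtual points differently from real LiDAR returns.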

Source journal

IEEE Transactions on Geoscience and Remote Sensing (Engineering & Technology: Geochemistry & Geophysics)

CiteScore: 11.50
Self-citation rate: 28.00%
Articles published: 1912
Review time: 4.0 months

Journal description: IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.
