FSD V2: Improving Fully Sparse 3D Object Detection With Virtual Voxels
Authors: Lue Fan; Feng Wang; Naiyan Wang; Zhaoxiang Zhang
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 2, pp. 1279-1292
Publication date: 2024-11-19
DOI: 10.1109/TPAMI.2024.3502456
URL: https://ieeexplore.ieee.org/document/10758248/
Citations: 0
Abstract
LiDAR-based fully sparse architecture has gained increasing attention. FSDv1 stands out as a representative work, achieving impressive efficacy and efficiency, albeit with intricate structures and handcrafted designs. In this paper, we present FSDv2, an evolution that aims to simplify the previous FSDv1 and eliminate the ad-hoc heuristics in its handcrafted instance-level representation, thus promoting better universality. To this end, we introduce virtual voxels, taking over the clustering-based instance segmentation in FSDv1. Virtual voxels not only address the notorious issue of the Center Feature Missing in fully sparse detectors but also endow the framework with a more elegant and streamlined approach. Besides, we develop a suite of components to complement the virtual voxel mechanism, including a virtual voxel encoder, a virtual voxel mixer, and a virtual voxel assignment strategy. We conduct experiments on three large-scale datasets: Waymo Open Dataset, Argoverse 2 dataset, and nuScenes dataset. Our results showcase state-of-the-art performance on all three datasets, highlighting the superiority of FSDv2 in long-range scenarios and its universality in achieving competitive performance across diverse scenarios. Moreover, we provide comprehensive experimental analysis to understand the workings of FSDv2.
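
The abstract does not spell out how virtual voxels are constructed. As a rough illustration of the Center Feature Missing issue and one plausible remedy, the toy 2D sketch below assumes that surface points are shifted toward the object center by a predicted offset (faked here with the known center) and that the shifted points are then voxelized into virtual voxels. The function names, the `strength` parameter, and the grid sizes are all hypothetical and are not taken from the paper.

```python
# Toy 2D illustration of Center Feature Missing and virtual voxels.
# ASSUMPTION: virtual voxels come from voxelizing points after they are
# shifted ("voted") toward the object center. The abstract does not state this.

VOXEL_SIZE = 1.0


def voxelize(points, voxel_size=VOXEL_SIZE):
    """Return the set of integer voxel coordinates occupied by the points."""
    return {tuple(int(c // voxel_size) for c in p) for p in points}


def vote(points, center, strength=0.9):
    """Shift each point toward `center`.

    Hypothetical stand-in for a learned offset prediction; strength < 1
    mimics imperfect votes that stop short of the exact center.
    """
    return [
        tuple(c + strength * (cc - c) for c, cc in zip(p, center))
        for p in points
    ]


# LiDAR returns lie on the object surface, so sample only the boundary
# cells of a 5 x 5 voxel footprint whose center is (2.5, 2.5).
center = (2.5, 2.5)
surface_points = []
for i in range(5):
    surface_points += [
        (i + 0.5, 0.5), (i + 0.5, 4.5),  # bottom and top edges
        (0.5, i + 0.5), (4.5, i + 0.5),  # left and right edges
    ]

real_voxels = voxelize(surface_points)
center_voxel = voxelize([center]).pop()

# Center Feature Missing: no real voxel covers the object center.
print(center_voxel in real_voxels)

# Voted points gather near the center; voxelizing them yields virtual voxels.
virtual_voxels = voxelize(vote(surface_points, center))
print(center_voxel in virtual_voxels)

# A sparse detector could then operate on real and virtual voxels together.
all_voxels = real_voxels | virtual_voxels
```

In this toy setup the real voxels form a hollow ring around the object, while the virtual voxels fill the empty center, which is the position a center-based detection head would need features from.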