Xinyu Sun , Lisheng Jin , Huanhuan Wang , Zhen Huo , Yang He , Guangqi Wang
{"title":"Spatial awareness enhancement based single-stage anchor-free 3D object detection for autonomous driving","authors":"Xinyu Sun , Lisheng Jin , Huanhuan Wang , Zhen Huo , Yang He , Guangqi Wang","doi":"10.1016/j.displa.2024.102821","DOIUrl":null,"url":null,"abstract":"<div><p>The real-time and accurate detection of three-dimensional (3D) objects based on LiDAR is a focal problem in the field of autonomous driving environment perception. Compared to two-stage and anchor-based 3D object detection methods that suffer from inference latency challenges, single-stage anchor-free 3D object detection approaches are more suitable for deployment in autonomous driving vehicles with the strict real-time requirement. However, they face the issue of insufficient spatial awareness, which can result in detection errors such as false positives and false negatives, thereby increasing the potential risks of autonomous driving. In response to this, we focus on enhancing the spatial awareness of CenterPoint, a widely used single-stage anchor-free 3D object detector in the industry. Considering the limited allocation of computational resources and the performance bottleneck caused by pillar encoder, we propose an efficient SSDCM backbone to strengthen feature representation and extraction. Furthermore, a simple BGC neck is devised to weight and exchange contextual information in order to deeply fuse multi-scale features. Combining improved backbone and neck networks, we construct a single-stage anchor-free 3D object detection model with spatial awareness enhancement, named CenterPoint-Spatial Awareness Enhancement (CenterPoint-SAE). We evaluate CenterPoint-SAE on two large-scale and challenging autonomous driving datasets, nuScenes and Waymo. It achieves 53.3% mAP and 62.5% NDS on nuScenes detection benchmark, and runs inference at a speed of 11.1 FPS. Compared to the baseline, the upgraded networks deliver a performance improvement of 1.6% mAP and 1.2% NDS at minor cost. Notably, on Waymo dataset, our method achieves competitive detection performance compared to two-stage and point-based methods.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"85 ","pages":"Article 102821"},"PeriodicalIF":3.7000,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Displays","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0141938224001859","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0
Abstract
The real-time and accurate detection of three-dimensional (3D) objects based on LiDAR is a focal problem in the field of autonomous driving environment perception. Compared to two-stage and anchor-based 3D object detection methods that suffer from inference latency challenges, single-stage anchor-free 3D object detection approaches are more suitable for deployment in autonomous driving vehicles with the strict real-time requirement. However, they face the issue of insufficient spatial awareness, which can result in detection errors such as false positives and false negatives, thereby increasing the potential risks of autonomous driving. In response to this, we focus on enhancing the spatial awareness of CenterPoint, a widely used single-stage anchor-free 3D object detector in the industry. Considering the limited allocation of computational resources and the performance bottleneck caused by pillar encoder, we propose an efficient SSDCM backbone to strengthen feature representation and extraction. Furthermore, a simple BGC neck is devised to weight and exchange contextual information in order to deeply fuse multi-scale features. Combining improved backbone and neck networks, we construct a single-stage anchor-free 3D object detection model with spatial awareness enhancement, named CenterPoint-Spatial Awareness Enhancement (CenterPoint-SAE). We evaluate CenterPoint-SAE on two large-scale and challenging autonomous driving datasets, nuScenes and Waymo. It achieves 53.3% mAP and 62.5% NDS on nuScenes detection benchmark, and runs inference at a speed of 11.1 FPS. Compared to the baseline, the upgraded networks deliver a performance improvement of 1.6% mAP and 1.2% NDS at minor cost. Notably, on Waymo dataset, our method achieves competitive detection performance compared to two-stage and point-based methods.
期刊介绍:
Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface.
Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals intended for display technologies and human factor engineers new to the field will also occasionally featured.