Authors: Yaqing Chen; Huaming Wang
DOI: 10.1109/JSEN.2024.3444816
IEEE Sensors Journal, vol. 24, no. 19, pp. 30643-30653; published 2024-08-22.
https://ieeexplore.ieee.org/document/10643791/
Accurate and Robust Roadside 3-D Object Detection Based on Height-Aware Scene Reconstruction
Roadside 3-D object detection allows for a drastic expansion of the visibility range and a reduction in occlusions for autonomous vehicles. Recent approaches are based on bird’s-eye view (BEV) fusion, which unifies multimodal features in a shared BEV representation space. However, the camera-to-BEV projection discards the geometric information of camera features, hindering the effectiveness of such methods. Moreover, depth-based camera lifting is inefficient and unstable in disrupted roadside scenarios. To address these challenges, this article introduces a novel 3-D object detection framework based on height-aware scene reconstruction, dubbed HSRDet. Specifically, we leverage height-aware 3-D reconstruction to ensure geometric consistency in BEV feature mapping and employ a fast camera-to-BEV transformation based on feature distillation to boost efficiency without compromising performance. In addition, we integrate a novel data augmentation method, View Shake (VS), to further improve the performance of our model. Extensive experiments on the DAIR-V2X dataset demonstrate that HSRDet not only achieves state-of-the-art detection accuracy but also exhibits strong robustness in disturbance scenarios. Further experiments on intelligent roadside units (RSUs) show that our method runs stably at 11.8 frames/s on an RTX 3090 Ti GPU, indicating broad prospects for engineering application.
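The height-aware lifting contrasted with depth-based lifting in the abstract rests on a standard geometric fact: for a fixed roadside camera, a pixel can be back-projected onto a world plane of known height without estimating per-pixel depth. The sketch below illustrates that general idea via inverse perspective mapping; it is an illustrative toy, not the authors' HSRDet code, and the function names and the world-to-camera calibration convention (x_cam = R·x_world + t) are assumptions.

```python
def mat_vec(M, v):
    # Multiply a 3x3 matrix (list of rows) by a 3-vector.
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def transpose(M):
    # Transpose of a 3x3 matrix.
    return [[M[j][i] for j in range(3)] for i in range(3)]

def pixel_to_ground(u, v, K_inv, R, t, height=0.0):
    """Back-project pixel (u, v) onto the horizontal world plane z = height.

    Generic inverse-perspective-mapping sketch, assuming the world-to-camera
    transform x_cam = R @ x_world + t and inverse intrinsics K_inv.
    Illustrative only; not the HSRDet implementation.
    """
    Rt = transpose(R)
    ray_cam = mat_vec(K_inv, [u, v, 1.0])   # viewing ray in the camera frame
    ray_world = mat_vec(Rt, ray_cam)        # rotate the ray into the world frame
    C = [-c for c in mat_vec(Rt, t)]        # camera centre in the world frame
    s = (height - C[2]) / ray_world[2]      # ray / plane intersection parameter
    return [C[i] + s * ray_world[i] for i in range(3)]
```

For example, with unit intrinsics and a camera 5 m above the world origin looking straight down (R flips the y- and z-axes), the principal-point pixel (0, 0) maps to the world origin on the ground plane. Depth-based lifting would instead require a predicted depth for every pixel, which is the instability the abstract attributes to disrupted roadside scenes.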
About the journal:
The fields of interest of the IEEE Sensors Journal are the theory, design, fabrication, manufacturing, and applications of devices for sensing and transducing physical, chemical, and biological phenomena, with emphasis on the electronics and physics aspects of sensors and integrated sensor-actuators. IEEE Sensors Journal deals with the following:
-Sensor Phenomenology, Modelling, and Evaluation
-Sensor Materials, Processing, and Fabrication
-Chemical and Gas Sensors
-Microfluidics and Biosensors
-Optical Sensors
-Physical Sensors: Temperature, Mechanical, Magnetic, and others
-Acoustic and Ultrasonic Sensors
-Sensor Packaging
-Sensor Networks
-Sensor Applications
-Sensor Systems: Signals, Processing, and Interfaces
-Actuators and Sensor Power Systems
-Sensor Signal Processing for high precision and stability (amplification, filtering, linearization, modulation/demodulation) and under harsh conditions (EMC, radiation, humidity, temperature); energy consumption/harvesting
-Sensor Data Processing (soft computing with sensor data, e.g., pattern recognition, machine learning, evolutionary computation; sensor data fusion; processing of wave (e.g., electromagnetic and acoustic) and non-wave (e.g., chemical, gravity, particle, thermal, radiative and non-radiative) sensor data; detection, estimation, and classification based on sensor data)
-Sensors in Industrial Practice