Authors: Yaqing Chen; Huaming Wang
DOI: 10.1109/JSEN.2024.3444816
IEEE Sensors Journal, vol. 24, no. 19, pp. 30643-30653; published 2024-08-22.
https://ieeexplore.ieee.org/document/10643791/
Accurate and Robust Roadside 3-D Object Detection Based on Height-Aware Scene Reconstruction
Roadside 3-D object detection allows for a drastic expansion of the visibility range and a reduction in occlusions for autonomous vehicles. Recent approaches are based on bird’s-eye view (BEV) fusion, which unifies multimodal features in a shared BEV representation space. However, the camera-to-BEV projection discards the geometric information of camera features, hindering the effectiveness of such methods. Moreover, depth-based camera lifting is inefficient and unstable in disrupted roadside scenarios. To address these challenges, this article introduces a novel 3-D object detection framework based on height-aware scene reconstruction, dubbed HSRDet. Specifically, we leverage height-aware 3-D reconstruction to ensure geometric consistency in BEV feature mapping and employ a fast camera-to-BEV transformation based on feature distillation to boost efficiency without compromising performance. In addition, we integrate a novel data augmentation method, View Shake (VS), to further improve the performance of our model. Extensive experiments on the DAIR-V2X dataset demonstrate that HSRDet not only achieves state-of-the-art detection accuracy but also exhibits strong robustness in disturbance scenarios. Further experiments on intelligent roadside units (RSUs) show that our method runs stably at 11.8 frames/s on an RTX 3090 Ti GPU, indicating broad prospects for engineering application.
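The height-aware lifting contrasted with depth-based lifting in the abstract rests on a standard geometric fact: for a fixed roadside camera, a pixel can be back-projected onto a world plane of known height without estimating per-pixel depth. The sketch below illustrates that general idea via inverse perspective mapping; it is an illustrative toy, not the authors' HSRDet code, and the function names and the world-to-camera calibration convention (x_cam = R·x_world + t) are assumptions.

```python
def mat_vec(M, v):
    # Multiply a 3x3 matrix (list of rows) by a 3-vector.
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def transpose(M):
    # Transpose of a 3x3 matrix.
    return [[M[j][i] for j in range(3)] for i in range(3)]

def pixel_to_ground(u, v, K_inv, R, t, height=0.0):
    """Back-project pixel (u, v) onto the horizontal world plane z = height.

    Generic inverse-perspective-mapping sketch, assuming the world-to-camera
    transform x_cam = R @ x_world + t and inverse intrinsics K_inv.
    Illustrative only; not the HSRDet implementation.
    """
    Rt = transpose(R)
    ray_cam = mat_vec(K_inv, [u, v, 1.0])   # viewing ray in the camera frame
    ray_world = mat_vec(Rt, ray_cam)        # rotate the ray into the world frame
    C = [-c for c in mat_vec(Rt, t)]        # camera centre in the world frame
    s = (height - C[2]) / ray_world[2]      # ray / plane intersection parameter
    return [C[i] + s * ray_world[i] for i in range(3)]
```

For example, with unit intrinsics and a camera 5 m above the world origin looking straight down (R flips the y- and z-axes), the principal-point pixel (0, 0) maps to the world origin on the ground plane. Depth-based lifting would instead require a predicted depth for every pixel, which is the instability the abstract attributes to disrupted roadside scenes.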
About the journal:
The fields of interest of the IEEE Sensors Journal are the theory, design, fabrication, manufacturing, and applications of devices for sensing and transducing physical, chemical, and biological phenomena, with emphasis on the electronics and physics aspects of sensors and integrated sensor-actuators. IEEE Sensors Journal deals with the following:
-Sensor Phenomenology, Modelling, and Evaluation
-Sensor Materials, Processing, and Fabrication
-Chemical and Gas Sensors
-Microfluidics and Biosensors
-Optical Sensors
-Physical Sensors: Temperature, Mechanical, Magnetic, and others
-Acoustic and Ultrasonic Sensors
-Sensor Packaging
-Sensor Networks
-Sensor Applications
-Sensor Systems: Signals, Processing, and Interfaces
-Actuators and Sensor Power Systems
-Sensor Signal Processing for high precision and stability (amplification, filtering, linearization, modulation/demodulation) and under harsh conditions (EMC, radiation, humidity, temperature); energy consumption/harvesting
-Sensor Data Processing (soft computing with sensor data, e.g., pattern recognition, machine learning, evolutionary computation; sensor data fusion; processing of wave (e.g., electromagnetic and acoustic) and non-wave (e.g., chemical, gravity, particle, thermal, radiative and non-radiative) sensor data; detection, estimation, and classification based on sensor data)
-Sensors in Industrial Practice