GARD: A Geometry-Informed and Uncertainty-Aware Baseline Method for Zero-Shot Roadside Monocular Object Detection
Authors: Yuru Peng; Beibei Wang; Zijian Yu; Lu Zhang; Jianmin Ji; Yu Zhang; Yanyong Zhang
Journal: IEEE Robotics and Automation Letters, vol. 10, no. 2, pp. 1297-1304 (impact factor 4.6; JCR Q2, Robotics)
Published: 2024-12-23
DOI: 10.1109/LRA.2024.3520923
URL: https://ieeexplore.ieee.org/document/10812033/
Citations: 0
Abstract
Roadside camera-based perception methods are in high demand for developing efficient vehicle-infrastructure collaborative perception systems. By focusing on object-level depth prediction, we explore the potential benefits of integrating environmental priors into such systems and propose a geometry-based roadside per-object depth estimation algorithm dubbed GARD. The proposed method capitalizes on the inherent geometric properties of the pinhole camera model to derive depth as well as 3D positions for given 2D targets in roadside-view images, alleviating the need for computationally intensive end-to-end learning architectures for monocular 3D detection. Using only a pre-trained 2D detection model, our approach does not require vast amounts of scene-specific training data and shows superior generalization abilities across varying environments and camera setups, making it a practical and cost-effective solution for monocular 3D object detection.
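The abstract does not spell out GARD's formulation, so the following is only a minimal sketch of the general idea it mentions: using pinhole camera geometry to turn a 2D image point into a depth and a 3D position. It is not the authors' algorithm. It assumes a flat ground plane, a camera with known intrinsics (`fx`, `fy`, `cx`, `cy`), a known mounting height and downward pitch, no roll or lens distortion, and that the chosen pixel (for example, the bottom center of a 2D box) lies on the ground. The function name and all parameters are hypothetical.

```python
import math


def ground_point_from_pixel(u, v, fx, fy, cx, cy, cam_height, pitch_rad):
    """Intersect the viewing ray of pixel (u, v) with a flat ground plane.

    Illustrative assumptions (not taken from the paper):
      - camera frame: x right, y down, z forward (standard pinhole convention)
      - world frame: X right, Y forward along the ground, Z up, ground at Z = 0
      - the camera sits at (0, 0, cam_height) and is pitched down by pitch_rad
    Returns (depth, point_in_camera_frame, point_in_world_frame).
    """
    # Normalized image coordinates: the ray direction in the camera frame is (dx, dy, 1).
    dx = (u - cx) / fx
    dy = (v - cy) / fy

    # Downward (negative world Z) component of the ray per unit of camera depth.
    drop_per_depth = dy * math.cos(pitch_rad) + math.sin(pitch_rad)
    if drop_per_depth <= 1e-9:
        # The ray is parallel to the ground or points above the horizon.
        raise ValueError("pixel ray does not intersect the ground plane")

    # Depth along the optical axis at which the ray has dropped by cam_height.
    depth = cam_height / drop_per_depth

    cam_xyz = (depth * dx, depth * dy, depth)
    world_xyz = (
        depth * dx,
        depth * (math.cos(pitch_rad) - dy * math.sin(pitch_rad)),
        0.0,
    )
    return depth, cam_xyz, world_xyz


if __name__ == "__main__":
    # Hypothetical setup: 1920x1080 image, camera 6 m above the road, pitched down 30 degrees.
    depth, cam_xyz, world_xyz = ground_point_from_pixel(
        u=960.0, v=540.0, fx=1000.0, fy=1000.0, cx=960.0, cy=540.0,
        cam_height=6.0, pitch_rad=math.radians(30.0),
    )
    print(depth, cam_xyz, world_xyz)
```

For the principal-point pixel in this hypothetical setup, the depth is the camera height divided by the sine of the pitch, about 12 m. A sketch like this needs only the camera calibration and a 2D detection, which is consistent with the abstract's point that no scene-specific 3D training data is required.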
About the journal:
The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.