RopeBEV: A Multi-Camera Roadside Perception Network in Bird's-Eye-View
Jinrang Jia, Guangqi Yi, Yifeng Shi
arXiv - CS - Computer Vision and Pattern Recognition, 2024-09-18
DOI: arxiv-2409.11706 (https://doi.org/arxiv-2409.11706)
Citations: 0
Abstract
Multi-camera perception methods in Bird's-Eye-View (BEV) have gained wide
application in autonomous driving. However, due to the differences between
roadside and vehicle-side scenarios, a multi-camera BEV solution for the
roadside is still lacking. This paper systematically analyzes the key challenges in
multi-camera BEV perception for roadside scenarios compared to vehicle-side.
These challenges include the diversity in camera poses, the uncertainty in
camera numbers, the sparsity in perception regions, and the ambiguity in
orientation angles. In response, we introduce RopeBEV, the first dense
multi-camera BEV approach for roadside perception. RopeBEV introduces BEV
augmentation to address the
training balance issues caused by diverse camera poses. By incorporating
CamMask and ROIMask (Region of Interest Mask), it supports variable camera
numbers and sparse perception, respectively. Finally, camera rotation embedding
is utilized to resolve orientation ambiguity. Our method ranks 1st on the
real-world highway dataset RoScenes and demonstrates its practical value on a
private urban dataset that covers more than 50 intersections and 600 cameras.
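To make the CamMask and ROIMask ideas concrete, here is a minimal NumPy sketch of masked BEV feature aggregation. This is not the paper's implementation: the function name, shapes, and mean-pooling fusion are all illustrative assumptions; the abstract only states that a camera mask supports variable camera counts and a region-of-interest mask supports sparse perception regions.

```python
import numpy as np

def aggregate_bev(cam_feats, cam_mask, roi_mask):
    """Toy BEV aggregation with a camera mask and a region-of-interest mask.

    cam_feats: (N_cam, H, W, C) per-camera features assumed already
               projected onto an H x W BEV grid (projection omitted).
    cam_mask:  (N_cam,) 1.0 for active cameras, 0.0 for absent ones, so a
               fixed-size model input can serve variable camera numbers.
    roi_mask:  (H, W) 1.0 inside the perceived region, 0.0 elsewhere, so
               cells outside the sparse perception region contribute nothing.
    """
    m = cam_mask[:, None, None, None]        # broadcast mask over H, W, C
    summed = (cam_feats * m).sum(axis=0)     # masked cameras drop out of the sum
    count = np.maximum(cam_mask.sum(), 1.0)  # avoid division by zero
    bev = summed / count                     # mean over active cameras only
    return bev * roi_mask[:, :, None]        # zero out non-ROI BEV cells
```

For example, with three camera slots of which only two are active, the masked camera's features are ignored entirely, and BEV cells outside the ROI come out as zero regardless of the camera features.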