Segment2Regress: Monocular 3D Vehicle Localization in Two Stages

Jaesung Choe, Kyungdon Joo, François Rameau, Gyumin Shim, I. Kweon
{"title":"回归:单目3D车辆定位的两个阶段","authors":"Jaesung Choe, Kyungdon Joo, François Rameau, Gyumin Shim, I. Kweon","doi":"10.15607/RSS.2019.XV.016","DOIUrl":null,"url":null,"abstract":"High-quality depth information is required to perform 3D vehicle detection, consequently, there exists a large performance gap between camera and LiDAR-based approaches. In this paper, our monocular camera-based 3D vehicle localization method alleviates the dependency on high-quality depth maps by taking advantage of the commonly accepted assumption that the observed vehicles lie on the road surface. We propose a two-stage approach that consists of a segment network and a regression network, called Segment2Regress. For a given single RGB image and a prior 2D object detection bounding box, the two stages are as follows: 1) The segment network activates the pixels under the vehicle (modeled as four line segments and a quadrilateral representing the area beneath the vehicle projected on the image coordinate). These segments are trained to lie on the road plane such that our network does not require full depth estimation. Instead, the depth is directly approximated from the known ground plane parameters. 2) The regression network takes the segments fused with the plane depth to predict the 3D location of a car at the ground level. To stabilize the regression, we introduce a coupling loss that enforces structural constraints. The efficiency, accuracy, and robustness of the proposed technique are highlighted through a series of experiments and ablation assessments. These tests are conducted on the KITTI bird’s eye view dataset where Segment2Regress demonstrates state-of-the-art performance. Further results are available at https://github.com/LifeBeyondExpectations/Segment2Regress","PeriodicalId":307591,"journal":{"name":"Robotics: Science and Systems XV","volume":"30 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Segment2Regress: Monocular 3D Vehicle Localization in Two Stages\",\"authors\":\"Jaesung Choe, Kyungdon Joo, François Rameau, Gyumin Shim, I. Kweon\",\"doi\":\"10.15607/RSS.2019.XV.016\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"High-quality depth information is required to perform 3D vehicle detection, consequently, there exists a large performance gap between camera and LiDAR-based approaches. In this paper, our monocular camera-based 3D vehicle localization method alleviates the dependency on high-quality depth maps by taking advantage of the commonly accepted assumption that the observed vehicles lie on the road surface. We propose a two-stage approach that consists of a segment network and a regression network, called Segment2Regress. For a given single RGB image and a prior 2D object detection bounding box, the two stages are as follows: 1) The segment network activates the pixels under the vehicle (modeled as four line segments and a quadrilateral representing the area beneath the vehicle projected on the image coordinate). These segments are trained to lie on the road plane such that our network does not require full depth estimation. Instead, the depth is directly approximated from the known ground plane parameters. 2) The regression network takes the segments fused with the plane depth to predict the 3D location of a car at the ground level. To stabilize the regression, we introduce a coupling loss that enforces structural constraints. 
The efficiency, accuracy, and robustness of the proposed technique are highlighted through a series of experiments and ablation assessments. These tests are conducted on the KITTI bird’s eye view dataset where Segment2Regress demonstrates state-of-the-art performance. Further results are available at https://github.com/LifeBeyondExpectations/Segment2Regress\",\"PeriodicalId\":307591,\"journal\":{\"name\":\"Robotics: Science and Systems XV\",\"volume\":\"30 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-06-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Robotics: Science and Systems XV\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.15607/RSS.2019.XV.016\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Robotics: Science and Systems XV","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.15607/RSS.2019.XV.016","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Cited by: 6

Abstract

3D vehicle detection requires high-quality depth information; consequently, there is a large performance gap between camera-based and LiDAR-based approaches. In this paper, our monocular camera-based 3D vehicle localization method alleviates the dependency on high-quality depth maps by taking advantage of the commonly accepted assumption that the observed vehicles lie on the road surface. We propose a two-stage approach, called Segment2Regress, that consists of a segment network and a regression network. Given a single RGB image and a prior 2D object-detection bounding box, the two stages are as follows: 1) The segment network activates the pixels under the vehicle, modeled as four line segments and a quadrilateral representing the area beneath the vehicle projected onto the image plane. These segments are trained to lie on the road plane, so our network does not require full depth estimation; instead, depth is approximated directly from the known ground-plane parameters. 2) The regression network takes the segments fused with the plane depth and predicts the 3D location of the car at ground level. To stabilize the regression, we introduce a coupling loss that enforces structural constraints. The efficiency, accuracy, and robustness of the proposed technique are demonstrated through a series of experiments and ablation assessments conducted on the KITTI bird's-eye-view dataset, where Segment2Regress demonstrates state-of-the-art performance. Further results are available at https://github.com/LifeBeyondExpectations/Segment2Regress
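
The key geometric step above, recovering depth from known ground-plane parameters, reduces to a ray-plane intersection rather than learned depth estimation. The sketch below is a minimal illustration of that step under standard pinhole assumptions, not the authors' implementation; `plane_depth` and the KITTI-like numbers in the usage lines are assumptions introduced here.

```python
import numpy as np

def plane_depth(u, v, K, n, d):
    """Depth of pixel (u, v) obtained by intersecting its viewing ray
    with the ground plane {X : n.X + d = 0} in camera coordinates.
    Hypothetical helper: K is the 3x3 pinhole intrinsic matrix; n, d
    are the 'known ground plane parameters' the abstract refers to."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # back-projected viewing ray
    denom = float(n @ ray)
    if abs(denom) < 1e-9:
        raise ValueError("viewing ray is (near-)parallel to the ground plane")
    lam = -d / denom   # ray scale at the plane: n.(lam * ray) + d = 0
    X = lam * ray      # 3D ground point hit by the pixel
    return X[2], X     # depth (z component) and the full 3D point

# Usage with KITTI-like numbers (assumed): camera about 1.65 m above a flat
# road, y axis pointing down, so the plane is y = 1.65, i.e. n = (0, 1, 0),
# d = -1.65. A pixel below the horizon then maps to a unique ground point.
K = np.array([[721.5,   0.0, 609.6],
              [  0.0, 721.5, 172.9],
              [  0.0,   0.0,   1.0]])
z, X = plane_depth(700.0, 250.0, K, np.array([0.0, 1.0, 0.0]), -1.65)
```

Because every pixel below the horizon intersects the plane at a unique, strictly positive range, the segment network can skip full depth estimation entirely, as the abstract states.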
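
The abstract also mentions a coupling loss that enforces structural constraints on the regression but does not define its form. The following is a hypothetical sketch, shown purely to illustrate the idea: it couples a per-corner regression error with a term that preserves the pairwise distances among the four ground-contact corners, so the predicted quadrilateral cannot be deformed one point at a time.

```python
import torch
import torch.nn.functional as F

def coupling_loss(pred, gt, w=1.0):
    """Hypothetical coupling loss (the paper does not spell out its form).
    pred, gt: (B, 4, 2) corner coordinates on the ground plane.
    w: assumed weight balancing the two terms."""
    # Per-corner L1 regression error.
    point_term = F.l1_loss(pred, gt)
    # Structural term: pairwise corner-to-corner distances, (B, 4, 4),
    # must match those of the ground-truth quadrilateral.
    struct_term = F.l1_loss(torch.cdist(pred, pred), torch.cdist(gt, gt))
    return point_term + w * struct_term
```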