Evaluation of Measurement Space Representations of Deep Multi-Modal Object Detection for Extended Object Tracking in Autonomous Driving

Lino Antoni Giefer, Razieh Khamsehashari, K. Schill

2020 IEEE 3rd Connected and Automated Vehicles Symposium (CAVS), November 2020. DOI: 10.1109/CAVS51000.2020.9334646
The perception capability of automated systems such as autonomous cars plays a decisive role in their safe and reliable operation. With the continuously growing accuracy of deep neural networks for object detection on the one hand, and the investigation of appropriate space representations for object tracking on the other, both of these essential perception components have received particular research attention in recent years. However, the early fusion of multiple sensors makes the determination of suitable measurement spaces a complex and non-trivial task. In this paper, we propose the use of a deep multi-modal object detection network for the early fusion of LiDAR and camera data to serve as the measurement source for an extended object tracking algorithm on Lie groups. We develop an extended Kalman filter and model the state space as the direct product Aff(2) × ℝ⁶, incorporating second- and third-order dynamics. We compare the tracking performance of different measurement space representations, namely SO(2) × ℝ⁴, SO(2)² × ℝ³, and Aff(2), to evaluate how our object detection network encapsulates the measurement parameters and their associated uncertainties. Our results show that, for single-object tracking, the lowest tracking errors are obtained when the measurement space is represented by the affine group. We therefore infer that our proposed object detection network captures the intrinsic relationships between the measurement parameters, especially between position and orientation.
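To make the role of a group-valued measurement space concrete, the following is a minimal sketch (in Python with NumPy; not the authors' implementation, and all function names are illustrative assumptions) of a measurement on SO(2) × ℝ⁴ and the innovation a Lie-group extended Kalman filter would compute from it: the orientation residual is taken on the group via the logarithm map, so it wraps correctly at ±π, while position and extent are differenced in the ordinary Euclidean way.

import numpy as np

# Minimal sketch, assuming a measurement on SO(2) x R^4: orientation as
# a 2x2 rotation matrix; position (x, y) and extent (length, width) as
# Euclidean components.

def so2_exp(theta):
    """Exponential map R -> SO(2): angle to rotation matrix."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def so2_log(R):
    """Logarithm map SO(2) -> R: rotation matrix to wrapped angle."""
    return np.arctan2(R[1, 0], R[0, 0])

def innovation(z_rot, z_euc, pred_rot, pred_euc):
    """Innovation for a measurement on SO(2) x R^4.

    The rotational part lives on the group, so its residual is
    log(pred_rot^T @ z_rot), which wraps correctly at +/- pi; the
    Euclidean part (position and extent) is an ordinary difference.
    """
    d_theta = so2_log(pred_rot.T @ z_rot)
    return np.concatenate(([d_theta], z_euc - pred_euc))

# A detection at 179 deg against a prediction at -179 deg gives a
# wrapped orientation residual of about -2 deg, not 358 deg.
z_rot = so2_exp(np.deg2rad(179.0))
pred_rot = so2_exp(np.deg2rad(-179.0))
res = innovation(z_rot, np.zeros(4), pred_rot, np.zeros(4))
print(np.rad2deg(res[0]))  # approx. -2.0

The same group-logarithm residual generalizes to Aff(2), where scale and shear enter the group part as well, which is presumably why the affine representation couples errors in position and orientation more faithfully than the product representations.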