CFTrack: Center-based Radar and Camera Fusion for 3D Multi-Object Tracking

Ramin Nabati, Landon Harris, H. Qi
DOI: 10.1109/ivworkshops54471.2021.9669223
Venue: 2021 IEEE Intelligent Vehicles Symposium Workshops (IV Workshops)
Published: 2021-07-11
Citations: 13

Abstract

3D multi-object tracking is a crucial component in the perception system of autonomous driving vehicles. Tracking all dynamic objects around the vehicle is essential for tasks such as obstacle avoidance and path planning. Autonomous vehicles are usually equipped with different sensor modalities to improve accuracy and reliability. While sensor fusion has been widely used in object detection networks in recent years, most existing multi-object tracking algorithms either rely on a single input modality, or do not fully exploit the information provided by multiple sensing modalities. In this work, we propose an end-to-end network for joint object detection and tracking based on radar and camera sensor fusion. Our proposed method uses a center-based radar-camera fusion algorithm for object detection and utilizes a greedy algorithm for object association. The proposed greedy algorithm uses the depth, velocity and 2D displacement of the detected objects to associate them through time. This makes our tracking algorithm very robust to occluded and overlapping objects, as the depth and velocity information can help the network in distinguishing them. We evaluate our method on the challenging nuScenes dataset, where it achieves 20.0 AMOTA and outperforms all vision-based 3D tracking methods in the benchmark, as well as the baseline LiDAR-based method. Our method is online with a runtime of 35ms per image, making it very suitable for autonomous driving applications.
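The greedy association step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the cost function, weights, thresholds, and data layout (`center`, `depth`, `velocity` fields) are all assumptions made for the example.

```python
import numpy as np

def greedy_associate(tracks, detections, max_cost=1.0, weights=(1.0, 1.0, 1.0)):
    """Greedily match existing tracks to new detections.

    Each track/detection is a dict with 'center' (2D image-plane position),
    'depth', and 'velocity'. The matching cost combines 2D displacement,
    depth difference, and velocity difference, as the abstract describes;
    the weighted-sum form and all parameter names are illustrative.
    """
    w_disp, w_depth, w_vel = weights
    # Build the full track x detection cost matrix.
    cost = np.full((len(tracks), len(detections)), np.inf)
    for i, t in enumerate(tracks):
        for j, d in enumerate(detections):
            disp = np.linalg.norm(np.asarray(t["center"]) - np.asarray(d["center"]))
            ddepth = abs(t["depth"] - d["depth"])
            dvel = np.linalg.norm(np.asarray(t["velocity"]) - np.asarray(d["velocity"]))
            cost[i, j] = w_disp * disp + w_depth * ddepth + w_vel * dvel

    matches = []
    # Greedy: repeatedly take the globally cheapest remaining pair.
    while np.isfinite(cost).any():
        i, j = np.unravel_index(np.argmin(cost), cost.shape)
        if cost[i, j] > max_cost:
            break
        matches.append((i, j))
        cost[i, :] = np.inf  # each track matched at most once
        cost[:, j] = np.inf  # each detection matched at most once
    return matches
```

Using depth and velocity in the cost is what makes the association robust to occlusion: two objects that overlap in the image plane typically still differ in depth or velocity, so their costs to the wrong track stay high.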