CLAMOT: 3D Detection and Tracking via Multi-modal Feature Aggregation

Proceedings of the 4th International Conference on Image Processing and Machine Vision, Pub Date: 2022-03-25, DOI: 10.1145/3529446.3529451

Shuo Zhang, Xiaolong Liu, Wenqi Tao

Citations: 0

Abstract

In autonomous driving, multi-object tracking (MOT) helps vehicles perceive their surroundings and plan motion on that basis. LiDAR-based methods suffer from the sparsity of LiDAR points and can detect objects only within a limited range. To address this, we propose a camera and LiDAR aggregation module, CLA-fusion, that fuses the features of the two modalities in a point-wise manner. The enhanced points are then passed through a 3D backbone for feature extraction. For detection, we adopt a center-based method: a keypoint detector locates object centers, and the other attributes, such as 3D size and velocity, are regressed. For tracking, we use a simple but effective strategy, closest-point matching. We name the model CLAMOT after the structure and characteristics of the overall framework. Experiments on the nuScenes and Waymo benchmarks show competitive results.
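The abstract names closest-point matching without giving its details. The following is a minimal sketch of one plausible reading, assuming that each detection's center is projected back to the previous frame using its regressed velocity and then paired greedily with the nearest unmatched track center. The function name `closest_point_match`, its parameters, the back-projection step, and the `max_dist` gate are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def closest_point_match(track_centers, det_centers, det_velocities, dt, max_dist):
    """Greedy nearest-center association (illustrative sketch, not the paper's code).

    track_centers:  (T, 2) centers of existing tracks in the previous frame
    det_centers:    (D, 2) centers detected in the current frame
    det_velocities: (D, 2) velocities regressed for each detection
    dt:             time elapsed between the two frames
    max_dist:       assumed gating threshold; farther pairs are never matched

    Returns (matches, unmatched): a list of (track_index, det_index) pairs
    and the indices of detections that would start new tracks.
    """
    tracks = np.asarray(track_centers, dtype=float).reshape(-1, 2)
    dets = np.asarray(det_centers, dtype=float).reshape(-1, 2)
    vels = np.asarray(det_velocities, dtype=float).reshape(-1, 2)

    # Assumption: move each detection back in time so it is comparable
    # with the track centers stored from the previous frame.
    projected = dets - vels * dt

    # Pairwise Euclidean distances, shape (D, T).
    dists = np.linalg.norm(projected[:, None, :] - tracks[None, :, :], axis=2)

    # Visit candidate pairs from closest to farthest.
    pairs = sorted(
        (float(dists[d, t]), d, t)
        for d in range(dists.shape[0])
        for t in range(dists.shape[1])
    )

    matches, used_dets, used_tracks = [], set(), set()
    for dist, d, t in pairs:
        if dist > max_dist:
            break  # all remaining pairs are even farther apart
        if d in used_dets or t in used_tracks:
            continue  # each track and each detection is matched at most once
        matches.append((t, d))
        used_dets.add(d)
        used_tracks.add(t)

    unmatched = [d for d in range(dists.shape[0]) if d not in used_dets]
    return matches, unmatched
```

In this sketch, unmatched detections would initialize new tracks, and tracks that receive no detection would be kept or dropped according to some age policy; both choices are left out because the abstract does not specify them.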

