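Multiple object detection and tracking in autonomous vehicles: A survey on enhanced affinity computation and its multimodal applications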

IF 4.2 | Zone 3, Computer Science | Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS

ICT Express | Pub Date: 2025-08-01 | DOI: 10.1016/j.icte.2025.06.005

Muhammad Adeel Altaf, Min Young Kim

ICT Express, Volume 11, Issue 4, Pages 809-818. Journal Article. https://www.sciencedirect.com/science/article/pii/S2405959525000803

Citations: 0

Abstract


Three-dimensional (3D) object tracking is crucial in computer vision applications, particularly in autonomous driving, robotics, and surveillance. Despite advancements, effectively utilizing multimodal data to improve multi-object detection and tracking (MODT) remains challenging. This study introduces ACMODT, an affinity computation-based multi-object detection and tracking framework that integrates camera (2D) and LiDAR (3D) data for enhanced MODT performance in autonomous driving. This approach leverages EPNet as a backbone, utilizing 2D–3D feature fusion for accurate proposal generation. A deep neural network (DNN) extracts robust appearance and geometric features, while an improved affinity computation module combines Refined Boost Correlation Features (RBCF) and 3D-Extended Geometric IoU (3D-XGIoU) for precise object association. Motion prediction is refined using a Kalman filter (KF), and Gaussian Mixture Model (GMM)-based data association ensures consistent tracking. Experiments on the KITTI car tracking benchmark for quantitative analysis and the RADIATE dataset for visualization demonstrate that our method achieves superior tracking accuracy and precision compared to state-of-the-art multi-object tracking (MOT) approaches, proving its effectiveness for real-time object tracking.
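To make the idea of affinity-based association concrete, here is a minimal sketch. It is not the ACMODT implementation: the abstract does not define 3D-XGIoU, RBCF, or the GMM-based association, so this example substitutes the standard generalized IoU (GIoU) on axis-aligned 3D boxes as the affinity score and a simple greedy matcher for the association step. The box layout, the function names (`giou_3d`, `greedy_associate`), and the threshold value are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Assumed box layout: (xmin, ymin, zmin, xmax, ymax, zmax), axis-aligned.
# Real 3D trackers typically use oriented boxes; this is a simplification.
Box = Tuple[float, float, float, float, float, float]


def volume(b: Box) -> float:
    """Volume of a box; zero if any side has non-positive length."""
    return max(0.0, b[3] - b[0]) * max(0.0, b[4] - b[1]) * max(0.0, b[5] - b[2])


def giou_3d(a: Box, b: Box) -> float:
    """Standard generalized IoU for two axis-aligned 3D boxes, in [-1, 1]."""
    inter_box = (max(a[0], b[0]), max(a[1], b[1]), max(a[2], b[2]),
                 min(a[3], b[3]), min(a[4], b[4]), min(a[5], b[5]))
    inter = volume(inter_box)
    union = volume(a) + volume(b) - inter
    # Smallest axis-aligned box enclosing both inputs.
    hull = (min(a[0], b[0]), min(a[1], b[1]), min(a[2], b[2]),
            max(a[3], b[3]), max(a[4], b[4]), max(a[5], b[5]))
    hull_vol = volume(hull)
    if union <= 0.0 or hull_vol <= 0.0:
        return 0.0  # degenerate boxes: treat as no affinity (assumption)
    return inter / union - (hull_vol - union) / hull_vol


def greedy_associate(tracks: List[Box], detections: List[Box],
                     threshold: float = 0.0) -> List[Tuple[int, int]]:
    """Match tracks to detections by descending affinity (greedy, one-to-one)."""
    scored = sorted(
        ((giou_3d(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for score, ti, di in scored:
        if score <= threshold:
            break  # remaining pairs are too dissimilar to associate
        if ti in used_tracks or di in used_dets:
            continue
        used_tracks.add(ti)
        used_dets.add(di)
        matches.append((ti, di))
    return sorted(matches)


if __name__ == "__main__":
    tracks = [(0, 0, 0, 2, 2, 2), (10, 0, 0, 12, 2, 2)]
    detections = [(10.5, 0, 0, 12.5, 2, 2), (0.5, 0, 0, 2.5, 2, 2),
                  (50, 50, 0, 52, 52, 2)]
    print(greedy_associate(tracks, detections))
```

In this sketch the third detection stays unmatched because its affinity to every track is negative; a full tracker would typically start a new track for it. A globally optimal assignment (for example, the Hungarian algorithm) could replace the greedy loop without changing the affinity function.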


Source journal: ICT Express

CiteScore: 10.20

Self-citation rate: 1.90%

Articles published: 167

Review time: 35 weeks

About the journal: ICT Express, published by the Korean Institute of Communications and Information Sciences (KICS), is an international, peer-reviewed research publication covering all aspects of information and communication technology (ICT). The journal aims to publish research that advances the theoretical and practical understanding of ICT convergence, platform technologies, communication networks, and device technologies. Technological advances in the ICT sector enable portable devices to stay connected at all times while supporting high data rates, which has driven the recent popularity of smartphones and their considerable impact on economic and social development.