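Multiple object detection and tracking in autonomous vehicles: A survey on enhanced affinity computation and its multimodal applications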

IF 4.2 | Zone 3, Computer Science | JCR Q1, Computer Science, Information Systems
Muhammad Adeel Altaf, Min Young Kim
Journal: ICT Express, vol. 11, no. 4, pp. 809-818
Published: 2025-08-01
DOI: 10.1016/j.icte.2025.06.005
URL: https://www.sciencedirect.com/science/article/pii/S2405959525000803
Citations: 0

Abstract

Three-dimensional (3D) object tracking is crucial in computer vision applications, particularly autonomous driving, robotics, and surveillance. Despite recent advances, making effective use of multimodal data to improve multi-object detection and tracking (MODT) remains challenging. This study introduces ACMODT, an affinity computation-based MODT framework that integrates camera (2D) and LiDAR (3D) data to improve MODT performance in autonomous driving. The approach uses EPNet as a backbone and applies 2D–3D feature fusion to generate accurate proposals. A deep neural network (DNN) extracts robust appearance and geometric features, and an improved affinity computation module combines Refined Boost Correlation Features (RBCF) with 3D-Extended Geometric IoU (3D-XGIoU) for precise object association. A Kalman filter (KF) refines motion prediction, and Gaussian Mixture Model (GMM)-based data association ensures consistent tracking. Experiments on the KITTI car tracking benchmark (quantitative analysis) and the RADIATE dataset (visualization) show that the method achieves higher tracking accuracy and precision than state-of-the-art multi-object tracking (MOT) approaches, supporting its effectiveness for real-time object tracking.
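
To make the idea of affinity-based association concrete, the following is a minimal sketch and not the ACMODT implementation. It assumes axis-aligned boxes in the hypothetical format `(x_min, y_min, z_min, x_max, y_max, z_max)`, uses plain 3D IoU as a stand-in for 3D-XGIoU (whose definition is not given in the abstract), and replaces the GMM-based association with simple greedy matching. All function names and the threshold value are illustrative assumptions.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x_min, y_min, z_min, x_max, y_max, z_max).
Box = Tuple[float, float, float, float, float, float]


def volume(b: Box) -> float:
    """Volume of an axis-aligned box; zero if any side is empty or inverted."""
    return max(0.0, b[3] - b[0]) * max(0.0, b[4] - b[1]) * max(0.0, b[5] - b[2])


def iou_3d(a: Box, b: Box) -> float:
    """Plain axis-aligned 3D IoU (a stand-in, not the paper's 3D-XGIoU)."""
    inter: Box = (
        max(a[0], b[0]), max(a[1], b[1]), max(a[2], b[2]),
        min(a[3], b[3]), min(a[4], b[4]), min(a[5], b[5]),
    )
    inter_vol = volume(inter)
    union = volume(a) + volume(b) - inter_vol
    return inter_vol / union if union > 0 else 0.0


def affinity_matrix(tracks: List[Box], detections: List[Box]) -> List[List[float]]:
    """Rows are predicted track boxes, columns are new detections."""
    return [[iou_3d(t, d) for d in detections] for t in tracks]


def greedy_associate(aff: List[List[float]], threshold: float = 0.3) -> List[Tuple[int, int]]:
    """Greedy one-to-one matching, highest affinity first (not GMM-based)."""
    candidates = sorted(
        ((aff[i][j], i, j) for i in range(len(aff)) for j in range(len(aff[i]))),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for score, i, j in candidates:
        if score < threshold:
            break  # all remaining candidates are below the threshold
        if i in used_tracks or j in used_dets:
            continue
        used_tracks.add(i)
        used_dets.add(j)
        matches.append((i, j))
    return sorted(matches)


# Two predicted tracks and three detections; the third detection is unmatched.
tracks = [(0, 0, 0, 2, 2, 2), (10, 0, 0, 12, 2, 2)]
detections = [(10.5, 0, 0, 12.5, 2, 2), (0.5, 0, 0, 2.5, 2, 2), (50, 50, 0, 52, 52, 2)]
print(greedy_associate(affinity_matrix(tracks, detections)))  # [(0, 1), (1, 0)]
```

In a full tracker, the track boxes would come from a motion model such as a Kalman filter prediction, and unmatched detections would typically start new tracks; both steps are omitted here for brevity.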
Source journal: ICT Express
CiteScore: 10.20
Self-citation rate: 1.90%
Articles published: 167
Review time: 35 weeks
Journal description: The ICT Express journal, published by the Korean Institute of Communications and Information Sciences (KICS), is an international, peer-reviewed research publication covering all aspects of information and communication technology (ICT). The journal aims to publish research that advances the theoretical and practical understanding of ICT convergence, platform technologies, communication networks, and device technologies. Advances in the ICT sector enable portable devices to stay connected at all times while supporting high data rates, which has driven the recent popularity of smartphones and their considerable impact on economic and social development.