Voxel Pillar Multi-frame Cross Attention Network for sparse point cloud robust single object tracking

Impact factor: 7.5 · CAS Region 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)
Luda Zhao, Yihua Hu, Xing Yang, Yicheng Wang, Zhenglei Dou, Yan Zhang
Journal: Pattern Recognition, Volume 167, Article 111771
Published: 2025-05-12
DOI: 10.1016/j.patcog.2025.111771
URL: https://www.sciencedirect.com/science/article/pii/S0031320325004315
Open access: No
Citations: 0

Abstract

Single object tracking (SOT) in dynamic point cloud sequences is critically important in autonomous driving, remote sensing navigation, and smart industrial applications. Point clouds collected by LiDAR sensors become sparse under sensor-related and environmental disturbances, and the limited robustness of existing SOT algorithms then leads to inaccurate tracking. To mitigate these challenges, we propose a Voxel Pillar Multi-frame Cross Attention Network (VPMCAN) designed for robust tracking in sparse point clouds. VPMCAN employs a voxel-based encoding of pillar information for feature extraction and uses a dense pyramid network to extract multi-scale features from sparse data. Integrating multi-frame and cross-attention mechanisms during feature fusion balances global and local features, significantly enhancing long-term tracking robustness for the target. Additionally, VPMCAN's design prioritizes a lightweight architecture to ensure hardware-friendly implementation. To showcase its efficacy, we constructed a dataset of maritime point cloud video sequences and conducted extensive experiments on the KITTI, nuScenes, and Waymo datasets. Results show that VPMCAN achieves the best performance in non-sparse scenes and a 32.5% improvement over state-of-the-art algorithms in sparse scenes, with an average performance increase of more than 20%. This highlights the efficacy of the lightweight point cloud SOT algorithm in robustly tracking sparse targets and suggests promising practical applications.
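
The abstract names two building blocks, pillar-based encoding and cross-attention across frames, without giving details. The sketch below is not the VPMCAN implementation. It is a minimal NumPy illustration of the two general ideas, under stated assumptions: pillars are formed by binning points on an x-y grid and mean-pooled (a stand-in for a learned encoder), and attention is single-head with no learned weights. The function names `pillarize` and `cross_attention`, the `cell` parameter, and the random input frames are all hypothetical.

```python
import numpy as np


def pillarize(points, cell=0.5):
    """Group (N, 3) points into vertical pillars on an x-y grid.

    Returns a dict mapping (ix, iy) grid indices to the mean point of
    that pillar. Mean pooling is an assumption standing in for a
    learned pillar encoder.
    """
    keys = np.floor(points[:, :2] / cell).astype(int)
    pillars = {}
    for key, p in zip(map(tuple, keys), points):
        pillars.setdefault(key, []).append(p)
    return {k: np.mean(v, axis=0) for k, v in pillars.items()}


def cross_attention(query, context):
    """Single-head scaled dot-product cross-attention (no learned weights).

    query:   (Nq, d) features, e.g. pillars of the current frame
    context: (Nc, d) features, e.g. pillars of an earlier frame
    Returns (Nq, d): each query row becomes a weighted mix of context rows.
    """
    d = query.shape[1]
    scores = query @ context.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over context rows
    return weights @ context


# Hypothetical sparse scans of two consecutive frames.
rng = np.random.default_rng(0)
prev_frame = rng.uniform(-2.0, 2.0, size=(40, 3))
curr_frame = rng.uniform(-2.0, 2.0, size=(25, 3))

prev_feats = np.stack(list(pillarize(prev_frame).values()))
curr_feats = np.stack(list(pillarize(curr_frame).values()))

# Fuse current-frame pillars with information from the previous frame.
fused = cross_attention(curr_feats, prev_feats)
print(fused.shape == curr_feats.shape)
```

In this sketch the current frame supplies the queries and the earlier frame supplies the context, so every current pillar receives a mix of past pillar features. A trained network would add learned projections, multiple heads, and more than one past frame.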
Source journal
Pattern Recognition (Engineering & Technology: Electrical and Electronic Engineering)
CiteScore: 14.40
Self-citation rate: 16.20%
Articles published: 683
Review time: 5.6 months
About the journal: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.