{"title":"SNNTracker:在线高速多目标跟踪与Spike相机。","authors":"Yajing Zheng,Chengen Li,Jiyuan Zhang,Zhaofei Yu,Tiejun Huang","doi":"10.1109/tpami.2025.3610696","DOIUrl":null,"url":null,"abstract":"Multi-object tracking (MOT) is crucial for applications such as autonomous driving and robotics, yet traditional image-based methods struggle in high-speed scenarios due to motion blur and temporal gaps caused by low frame rates. Spike cameras, with their ability to continuously record spatiotemporal signals, overcome these limitations. However, existing spike-based methods often rely on intermediate image reconstruction or discrete clustering, which limits their real-time performance and temporal continuity. To address this, we propose SNNTracker, the first fully spiking neural network (SNN)-based MOT algorithm tailored for spike cameras. SNNTracker integrates a dynamic neural field (DNF)-based attention mechanism for target detection and a winner-take-all (WTA)-based tracking module with online spike-timing-dependent plasticity (STDP) for adaptive learning of object trajectories. By directly processing spike streams without reconstruction, SNNTracker reduces latency, computational overhead, and dependency on image quality, making it ideal for ultra-high-speed environments. It maintains robust, continuous tracking even under occlusions, severe lighting variations, or temporary object disappearance, by leveraging SNN-estimated motion predictions and long-term online clustering. We construct three types of spike-camera MOT datasets covering dense and sparse annotations across diverse real-world scenarios, including camera ego-motion, deformable and ultra-fast motion (up to 2600 RPM), occlusion, indoor/outdoor lighting changes, and low-visibility tracking. Extensive experiments demonstrate that SNNTracker consistently outperforms state-of-the-art MOT methods-both ANN- and SNN-based-achieving MOTA scores above 96% and up to 100% in many sequences. Our results highlight the advantages of spike-driven SNNs for low-latency, high-speed, and label-free multi-object tracking, advancing neuromorphic vision for real-time perception.","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"171 1","pages":""},"PeriodicalIF":18.6000,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"SNNTracker: Online High-speed Multi-Object Tracking with Spike Camera.\",\"authors\":\"Yajing Zheng,Chengen Li,Jiyuan Zhang,Zhaofei Yu,Tiejun Huang\",\"doi\":\"10.1109/tpami.2025.3610696\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multi-object tracking (MOT) is crucial for applications such as autonomous driving and robotics, yet traditional image-based methods struggle in high-speed scenarios due to motion blur and temporal gaps caused by low frame rates. Spike cameras, with their ability to continuously record spatiotemporal signals, overcome these limitations. However, existing spike-based methods often rely on intermediate image reconstruction or discrete clustering, which limits their real-time performance and temporal continuity. To address this, we propose SNNTracker, the first fully spiking neural network (SNN)-based MOT algorithm tailored for spike cameras. SNNTracker integrates a dynamic neural field (DNF)-based attention mechanism for target detection and a winner-take-all (WTA)-based tracking module with online spike-timing-dependent plasticity (STDP) for adaptive learning of object trajectories. By directly processing spike streams without reconstruction, SNNTracker reduces latency, computational overhead, and dependency on image quality, making it ideal for ultra-high-speed environments. It maintains robust, continuous tracking even under occlusions, severe lighting variations, or temporary object disappearance, by leveraging SNN-estimated motion predictions and long-term online clustering. We construct three types of spike-camera MOT datasets covering dense and sparse annotations across diverse real-world scenarios, including camera ego-motion, deformable and ultra-fast motion (up to 2600 RPM), occlusion, indoor/outdoor lighting changes, and low-visibility tracking. Extensive experiments demonstrate that SNNTracker consistently outperforms state-of-the-art MOT methods-both ANN- and SNN-based-achieving MOTA scores above 96% and up to 100% in many sequences. Our results highlight the advantages of spike-driven SNNs for low-latency, high-speed, and label-free multi-object tracking, advancing neuromorphic vision for real-time perception.\",\"PeriodicalId\":13426,\"journal\":{\"name\":\"IEEE Transactions on Pattern Analysis and Machine Intelligence\",\"volume\":\"171 1\",\"pages\":\"\"},\"PeriodicalIF\":18.6000,\"publicationDate\":\"2025-09-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Pattern Analysis and Machine Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1109/tpami.2025.3610696\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Pattern Analysis and Machine Intelligence","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/tpami.2025.3610696","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
SNNTracker: Online High-speed Multi-Object Tracking with Spike Camera.
Multi-object tracking (MOT) is crucial for applications such as autonomous driving and robotics, yet traditional image-based methods struggle in high-speed scenarios due to motion blur and temporal gaps caused by low frame rates. Spike cameras, with their ability to continuously record spatiotemporal signals, overcome these limitations. However, existing spike-based methods often rely on intermediate image reconstruction or discrete clustering, which limits their real-time performance and temporal continuity. To address this, we propose SNNTracker, the first fully spiking neural network (SNN)-based MOT algorithm tailored for spike cameras. SNNTracker integrates a dynamic neural field (DNF)-based attention mechanism for target detection and a winner-take-all (WTA)-based tracking module with online spike-timing-dependent plasticity (STDP) for adaptive learning of object trajectories. By directly processing spike streams without reconstruction, SNNTracker reduces latency, computational overhead, and dependency on image quality, making it ideal for ultra-high-speed environments. It maintains robust, continuous tracking even under occlusions, severe lighting variations, or temporary object disappearance, by leveraging SNN-estimated motion predictions and long-term online clustering. We construct three types of spike-camera MOT datasets covering dense and sparse annotations across diverse real-world scenarios, including camera ego-motion, deformable and ultra-fast motion (up to 2600 RPM), occlusion, indoor/outdoor lighting changes, and low-visibility tracking. Extensive experiments demonstrate that SNNTracker consistently outperforms state-of-the-art MOT methods-both ANN- and SNN-based-achieving MOTA scores above 96% and up to 100% in many sequences. Our results highlight the advantages of spike-driven SNNs for low-latency, high-speed, and label-free multi-object tracking, advancing neuromorphic vision for real-time perception.
期刊介绍:
The IEEE Transactions on Pattern Analysis and Machine Intelligence publishes articles on all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence, with a particular emphasis on machine learning for pattern analysis. Areas such as techniques for visual search, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition and relevant specialized hardware and/or software architectures are also covered.