{"title":"基于事件相机的时空金字塔关键点检测","authors":"Yuqing Zhu;Yuan Gao;Tianle Ding;Xiang Liu;Wenfei Yang;Tianzhu Zhang","doi":"10.1109/TCSVT.2025.3559299","DOIUrl":null,"url":null,"abstract":"Event cameras are bio-inspired sensors with diverse advantages, including high temporal resolution and minimal power consumption. Therefore, event cameras enjoy a wide range of applications in computer vision, among which event keypoint detection plays a vital role. However, repeatable event keypoint detection remains challenging because the lack of temporal inter-frame interaction leads to descriptors with limited temporal consistency, which restricts the ability to perceive keypoint motion. Besides, detectors learned at single scale features are not suitable for event keypoints with significant motion speed differences in high-speed scenarios. To deal with these problems, we propose a novel Spatio-Temporal Pyramid Keypoint Detection Network (STPNet) for event cameras via a temporally consistent descriptor learning (TCL) module and a spatially diverse detector learning (SDL) module. The proposed STPNet enjoys several merits. First, the TCL module generates temporally consistent descriptors for specific keypoint motion patterns. Second, the SDL module produces spatially diverse detectors for applications in high-speed motion scenarios. Extensive experimental results on three challenging benchmarks show that our method notably outperforms state-of-the-art event keypoint detection methods. Specifically, our STPNet can outperform the best event keypoint detection method by 0.21px in reprj. error on Event-Camera, 4% in IoU on N-Caltech101, 0.13px in reprj. 
error on HVGA ATIS Corner and 5.94% in matching accuracy on DSEC.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"35 9","pages":"9384-9397"},"PeriodicalIF":11.1000,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Spatio-Temporal Pyramid Keypoint Detection With Event Cameras\",\"authors\":\"Yuqing Zhu;Yuan Gao;Tianle Ding;Xiang Liu;Wenfei Yang;Tianzhu Zhang\",\"doi\":\"10.1109/TCSVT.2025.3559299\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Event cameras are bio-inspired sensors with diverse advantages, including high temporal resolution and minimal power consumption. Therefore, event cameras enjoy a wide range of applications in computer vision, among which event keypoint detection plays a vital role. However, repeatable event keypoint detection remains challenging because the lack of temporal inter-frame interaction leads to descriptors with limited temporal consistency, which restricts the ability to perceive keypoint motion. Besides, detectors learned at single scale features are not suitable for event keypoints with significant motion speed differences in high-speed scenarios. To deal with these problems, we propose a novel Spatio-Temporal Pyramid Keypoint Detection Network (STPNet) for event cameras via a temporally consistent descriptor learning (TCL) module and a spatially diverse detector learning (SDL) module. The proposed STPNet enjoys several merits. First, the TCL module generates temporally consistent descriptors for specific keypoint motion patterns. Second, the SDL module produces spatially diverse detectors for applications in high-speed motion scenarios. Extensive experimental results on three challenging benchmarks show that our method notably outperforms state-of-the-art event keypoint detection methods. 
Specifically, our STPNet can outperform the best event keypoint detection method by 0.21px in reprj. error on Event-Camera, 4% in IoU on N-Caltech101, 0.13px in reprj. error on HVGA ATIS Corner and 5.94% in matching accuracy on DSEC.\",\"PeriodicalId\":13082,\"journal\":{\"name\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"volume\":\"35 9\",\"pages\":\"9384-9397\"},\"PeriodicalIF\":11.1000,\"publicationDate\":\"2025-04-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10960429/\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Circuits and Systems for Video Technology","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10960429/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Spatio-Temporal Pyramid Keypoint Detection With Event Cameras
Event cameras are bio-inspired sensors with diverse advantages, including high temporal resolution and minimal power consumption. They therefore enjoy a wide range of applications in computer vision, among which event keypoint detection plays a vital role. However, repeatable event keypoint detection remains challenging because the lack of temporal inter-frame interaction leads to descriptors with limited temporal consistency, which restricts the ability to perceive keypoint motion. In addition, detectors learned from single-scale features are not suitable for event keypoints whose motion speeds differ significantly in high-speed scenarios. To address these problems, we propose a novel Spatio-Temporal Pyramid Keypoint Detection Network (STPNet) for event cameras, built on a temporally consistent descriptor learning (TCL) module and a spatially diverse detector learning (SDL) module. The proposed STPNet enjoys several merits. First, the TCL module generates temporally consistent descriptors for specific keypoint motion patterns. Second, the SDL module produces spatially diverse detectors for applications in high-speed motion scenarios. Extensive experimental results on three challenging benchmarks show that our method notably outperforms state-of-the-art event keypoint detection methods. Specifically, STPNet outperforms the best event keypoint detection method by 0.21 px in reprojection error on Event-Camera, 4% in IoU on N-Caltech101, 0.13 px in reprojection error on HVGA ATIS Corner, and 5.94% in matching accuracy on DSEC.
About the journal:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.