Detecting Every Object From Events

Haitian Zhang; Chang Xu; Xinya Wang; Bingde Liu; Guang Hua; Lei Yu; Wen Yang

IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 8, pp. 7171-7178. Published 2025-04-28. DOI: 10.1109/TPAMI.2025.3565102. Available at https://ieeexplore.ieee.org/document/10979499/
Citations: 0
Abstract
Object detection is critical in autonomous driving, and it is more practical yet challenging to localize objects of unknown categories: an endeavour known as Class-Agnostic Object Detection (CAOD). Existing studies on CAOD predominantly rely on RGB cameras, but these frame-based sensors usually have high latency and limited dynamic range, leading to safety risks under extreme conditions such as fast-moving objects, overexposure, and darkness. In this study, we turn to event-based vision, characterized by sub-millisecond latency and high dynamic range, for robust CAOD. We propose Detecting Every Object in Events (DEOE), an approach aimed at high-speed, class-agnostic object detection in event-based vision. Built upon a fast event-based backbone, the Recurrent Vision Transformer, we jointly consider spatial and temporal consistency to identify potential objects. The discovered potential objects are assimilated as soft positive samples so that they are not suppressed as background. Moreover, we introduce a disentangled objectness head that separates foreground-background classification from novel object discovery, enhancing the model's generalization in localizing novel objects while maintaining a strong ability to filter out the background. Extensive experiments confirm the superiority of DEOE over strong baseline methods in both open-set and closed-set settings.
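To make the two mechanisms named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation: a "disentangled" objectness head modeled as two independently parameterized linear branches (foreground-background classification vs. class-agnostic novelty scoring), and "soft positive" training modeled as a weighted binary cross-entropy in which discovered potential objects receive a positive target with a reduced weight instead of a hard background label. All shapes, parameter choices, and the geometric-mean score fusion are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_head(features, weights, bias=0.0):
    """One scoring branch: a linear map followed by a sigmoid."""
    logits = features @ weights + bias
    return 1.0 / (1.0 + np.exp(-logits))

def soft_bce(pred, target, weight):
    """Weighted binary cross-entropy; soft positives use weight < 1."""
    eps = 1e-7
    loss = -(target * np.log(pred + eps) + (1 - target) * np.log(1 - pred + eps))
    return float((weight * loss).mean())

# Hypothetical setup: N candidate regions with D-dim features each.
N, D = 4, 8
features = rng.standard_normal((N, D))

# Two independent parameter sets -> the branches are disentangled:
# fg/bg classification and novel-object discovery do not share weights.
fg_scores = linear_head(features, rng.standard_normal(D))
novel_scores = linear_head(features, rng.standard_normal(D))

# One possible inference-time fusion of the two branches (assumption).
objectness = np.sqrt(fg_scores * novel_scores)

# Soft positive samples: a discovered potential object gets target 1
# with weight 0.5 rather than being labeled hard background (target 0).
targets = np.array([1.0, 0.0, 1.0, 0.0])   # last two: discovered / plain bg
weights = np.array([1.0, 1.0, 0.5, 1.0])   # soft weight for the discovery
loss = soft_bce(novel_scores, targets, weights)
print(objectness.shape, loss > 0.0)
```

The point of the split is that the foreground-background branch can stay conservative on annotated classes while the novelty branch, trained with soft positives, is free to score unannotated object-like regions highly instead of suppressing them.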