When Do Neuromorphic Sensors Outperform Cameras? Learning from Dynamic Features
Daniel Deniz, E. Ros, C. Fermüller, Francisco Barranco
2023 57th Annual Conference on Information Sciences and Systems (CISS), published 2023-03-22
DOI: 10.1109/CISS56502.2023.10089678
Abstract
Visual event sensors output data only when changes occur in the scene, and they do so at very high frequency. This smartly compresses the scene and thus enables real-time operation. Despite these advantages, works in the literature have struggled to show a niche for event-driven approaches compared to conventional sensors, especially when focusing on accuracy. In this work, we show a case that fully exploits the advantages of event sensors: for manipulation action recognition, learning from events achieves superior accuracy and time performance. Recognizing manipulation actions requires extracting and learning features from the hand pose and trajectory and from the interaction with the object. As shown in our work, approaches based on event sensors are the best fit for extracting these dynamic features, in contrast to conventional approaches based on full frames, which mostly extract spatial features and need to reconstruct the dynamics from sequences of frames. Finally, we show how, by using a tracker to extract the features to be learned only around the hand, we obtain an approach that is scene- and almost object-agnostic and achieves good time performance with a very limited impact on accuracy.
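
The abstract does not include code, but the core idea of cropping the event stream to a tracked hand region and accumulating it into a compact dynamic-feature representation can be illustrated with a minimal Python sketch. Everything below is an assumption for illustration: the event record layout (x, y, t, polarity), the bounding box supplied by a hypothetical hand tracker, and the accumulation into a normalized count image are not the authors' implementation.

```python
import numpy as np

def crop_events(events, bbox):
    """Keep only events inside the tracked hand bounding box.

    events: structured array with fields 'x', 'y', 't', 'p' (assumed layout).
    bbox:   (x_min, y_min, x_max, y_max) from a hypothetical hand tracker.
    """
    x_min, y_min, x_max, y_max = bbox
    mask = ((events['x'] >= x_min) & (events['x'] < x_max) &
            (events['y'] >= y_min) & (events['y'] < y_max))
    return events[mask]

def events_to_count_image(events, bbox, out_size=(64, 64)):
    """Accumulate cropped events into a normalized count image that a
    downstream classifier (e.g. a small CNN) could learn from."""
    x_min, y_min, x_max, y_max = bbox
    h, w = out_size
    # Map event coordinates from the bounding box onto the output grid.
    xs = ((events['x'] - x_min) * w / max(x_max - x_min, 1)).astype(int)
    ys = ((events['y'] - y_min) * h / max(y_max - y_min, 1)).astype(int)
    xs = np.clip(xs, 0, w - 1)
    ys = np.clip(ys, 0, h - 1)
    img = np.zeros((h, w), dtype=np.float32)
    np.add.at(img, (ys, xs), 1.0)   # count events per output pixel
    if img.max() > 0:
        img /= img.max()            # normalize to [0, 1]
    return img

# Example usage with synthetic events and a hypothetical tracker output.
rng = np.random.default_rng(0)
n = 5000
events = np.zeros(n, dtype=[('x', 'i4'), ('y', 'i4'), ('t', 'f8'), ('p', 'i1')])
events['x'] = rng.integers(0, 346, n)   # assumed sensor resolution
events['y'] = rng.integers(0, 260, n)
events['t'] = np.sort(rng.uniform(0.0, 0.05, n))
events['p'] = rng.integers(0, 2, n)

hand_bbox = (100, 80, 200, 180)          # hypothetical tracker output
cropped = crop_events(events, hand_bbox)
feature_map = events_to_count_image(cropped, hand_bbox)
print(feature_map.shape, feature_map.max())
```

Restricting the accumulation to the tracked hand region is what makes such a representation largely scene- and object-agnostic: background activity never enters the feature map, and only the dynamics of the hand and its immediate interaction with the object are passed to the learner.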