Few-shot event-based action recognition

Zanxi Ruan, Nan Pu, Jiangming Chen, Songqun Gao, Yanming Guo, Qiuyu Kong, Yuxiang Xie, Yingmei Wei

Neural Networks, Volume 191 (2025), Article 107750. DOI: 10.1016/j.neunet.2025.107750. Published online 2025-06-21.
Event cameras, owing to their distinctive sensing mechanism, show clear advantages in practical vision applications such as action recognition. Existing event-based action recognition methods, however, rely heavily on large-scale training data, and the high cost of camera deployment together with data privacy requirements makes it difficult to collect substantial data in real-world scenarios. To address this limitation, we explore a novel yet practical task, Few-Shot Event-Based Action Recognition (FSEAR), which aims to train a model from only a handful of labeled event action samples and to classify unlabeled data accurately into the correct category. Accordingly, we design a new framework for FSEAR, comprising a Noise-Aware Event Encoder (NAE) and a Distilled Prototypical Distance Fusion (DPDF). The former filters noise in the spatiotemporal domain while retaining information vital to action timing; the latter performs multi-scale distance measurements across geometric, directional, and distributional dimensions. The two modules reinforce each other and thus effectively exploit the latent characteristics of event data. Extensive experiments on four distinct event-based action recognition datasets demonstrate significant advantages of our model over other few-shot learning methods. Our code and models will be publicly released.
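The abstract does not specify how the NAE or DPDF modules are implemented, so the two sketches below are illustrations only, not the paper's method. The first shows one common way a noise-aware event encoder might suppress sensor noise: rasterize the event stream (x, y, t) into a spatiotemporal voxel grid and discard voxels with almost no activity in their local neighborhood, on the heuristic that isolated events are noise rather than motion. Every function name and parameter here is hypothetical.

```python
import torch
import torch.nn.functional as F

def denoise_voxels(events, h, w, t_bins, min_support=2):
    """Hypothetical spatiotemporal noise filter (not the paper's NAE).
    events: (N, 3) float tensor of (x, y, t) with x < w and y < h."""
    x, y, t = events[:, 0].long(), events[:, 1].long(), events[:, 2]
    # normalize timestamps into t_bins discrete slices
    tb = ((t - t.min()) / (t.max() - t.min() + 1e-9) * (t_bins - 1)).long()
    grid = torch.zeros(t_bins, h, w)
    grid.index_put_((tb, y, x), torch.ones(len(events)), accumulate=True)
    # count occupied voxels in each 3x3x3 neighborhood via 3D average pooling
    occ = (grid > 0).float().unsqueeze(0).unsqueeze(0)            # (1,1,T,H,W)
    neigh = F.avg_pool3d(occ, 3, stride=1, padding=1) * 27        # occupancy count
    keep = neigh.squeeze() >= min_support                          # drop isolated events
    return grid * keep
```

The second sketch illustrates the general idea of a multi-scale prototypical distance fusion: class prototypes are mean support embeddings (as in standard prototypical networks), and queries are scored by a weighted combination of a geometric distance (squared Euclidean), a directional distance (1 minus cosine similarity), and a distributional distance (symmetric KL over softmax-normalized embeddings). The actual DPDF design, including its distillation component, is not described in the abstract, and the choice of the three distances below is an assumption.

```python
def prototypes(support_emb, support_lbl, n_way):
    # mean embedding per class, as in prototypical networks
    return torch.stack([support_emb[support_lbl == c].mean(0) for c in range(n_way)])

def fused_distance(query_emb, protos, w=(1.0, 1.0, 1.0)):
    # geometric: squared Euclidean distance, shape (n_query, n_way)
    geo = torch.cdist(query_emb, protos).pow(2)
    # directional: 1 - cosine similarity
    dirn = 1.0 - F.cosine_similarity(query_emb.unsqueeze(1), protos.unsqueeze(0), dim=-1)
    # distributional: symmetric KL between softmax-normalized embeddings
    p = F.log_softmax(query_emb, dim=-1).unsqueeze(1)   # (n_query, 1, d)
    q = F.log_softmax(protos, dim=-1).unsqueeze(0)      # (1, n_way, d)
    dist = 0.5 * ((p.exp() * (p - q)).sum(-1) + (q.exp() * (q - p)).sum(-1))
    return w[0] * geo + w[1] * dirn + w[2] * dist

# usage: a 5-way 1-shot episode with 64-dim embeddings
emb_s, lbl_s = torch.randn(5, 64), torch.arange(5)
emb_q = torch.randn(15, 64)
logits = -fused_distance(emb_q, prototypes(emb_s, lbl_s, n_way=5))
pred = logits.argmax(-1)
```

In such a setup, the fusion weights `w` could be fixed or learned end-to-end; negating the fused distance turns it into classification logits over the episode's classes.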
Journal overview:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.