Event Tubelet Compressor: Generating Compact Representations for Event-Based Action Recognition

Bochen Xie, Yongjian Deng, Z. Shao, Hai Liu, Qingsong Xu, Youfu Li
2022 7th International Conference on Control, Robotics and Cybernetics (CRC), December 15, 2022. DOI: 10.1109/CRC55853.2022.10041200

Abstract

Event cameras asynchronously capture pixel-level intensity changes in scenes and output a stream of events. Compared with traditional frame-based cameras, they offer competitive imaging characteristics: low latency, high dynamic range, and low power consumption. This makes event cameras ideal for vision tasks in dynamic scenarios, such as human action recognition. The best-performing event-based algorithms convert events into frame-based representations and feed them into existing learning models. However, generating informative frames for long-duration event streams remains a challenge, because event cameras operate asynchronously without a fixed frame rate. In this work, we propose a novel frame-based representation named Compact Event Image (CEI) for action recognition. This representation is generated in a learnable way by a self-attention-based module named Event Tubelet Compressor (EVTC). The EVTC module adaptively summarizes the long-term dynamics and temporal patterns of events into a CEI frame set. EVTC can be combined with conventional video backbones for end-to-end event-based action recognition. We evaluate our approach on three benchmark datasets, and experimental results show it outperforms state-of-the-art methods by a large margin.
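The abstract gives no implementation details, but the general idea it describes — binning an event stream into frames, then using attention to pool many frames into a few compact ones — can be illustrated with a minimal NumPy sketch. Everything below is an assumption for illustration only: the function names (`events_to_frames`, `compress_tubelets`), the event tuple layout `(t, x, y, p)`, and the random query vectors standing in for learned parameters are not the authors' EVTC module.

```python
import numpy as np

def events_to_frames(events, T, H, W):
    """Bin (t, x, y, p) events into T polarity-signed count frames."""
    frames = np.zeros((T, H, W), dtype=np.float32)
    t = events[:, 0]
    span = max(float(t.max() - t.min()), 1e-9)
    # Map each timestamp to one of T temporal bins.
    bins = np.clip(((t - t.min()) / span * T).astype(int), 0, T - 1)
    for b, (_, x, y, p) in zip(bins, events):
        frames[b, int(y), int(x)] += 1.0 if p > 0 else -1.0
    return frames

def compress_tubelets(frames, K, seed=0):
    """Pool T frame tokens into K compact frames via cross-attention.

    The K query vectors stand in for learned parameters; here they
    are random, so this shows only the mechanics, not a trained model.
    """
    T = frames.shape[0]
    tokens = frames.reshape(T, -1)                 # (T, H*W) frame tokens
    d = tokens.shape[1]
    queries = np.random.default_rng(seed).standard_normal((K, d)) / np.sqrt(d)
    scores = queries @ tokens.T / np.sqrt(d)       # (K, T) attention logits
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)        # softmax over the T frames
    cei = attn @ tokens                            # each output mixes all frames
    return cei.reshape(K, frames.shape[1], frames.shape[2])

# Usage: compress 16 temporal slices into 4 compact frames.
rng = np.random.default_rng(1)
N = 500
events = np.column_stack([
    np.sort(rng.random(N)),                 # timestamps in [0, 1)
    rng.integers(0, 8, N).astype(float),    # x
    rng.integers(0, 8, N).astype(float),    # y
    rng.choice([-1.0, 1.0], N),             # polarity
])
frames = events_to_frames(events, T=16, H=8, W=8)
cei = compress_tubelets(frames, K=4)        # shape (4, 8, 8)
```

In a trained system the queries would be learned parameters and the pooled frames would feed a standard video backbone; the fixed softmax pooling here is only the shape of that computation.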