Spiking Transfer Learning From RGB Image to Neuromorphic Event Stream

IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society. Pub Date: 2024-07-23. DOI: 10.1109/TIP.2024.3430043

Qiugang Zhan;Guisong Liu;Xiurui Xie;Ran Tao;Malu Zhang;Huajin Tang

Full text: https://ieeexplore.ieee.org/document/10608063/


Abstract

Recent advances in bio-inspired vision with event cameras and the associated spiking neural networks (SNNs) have provided promising solutions for low-power neuromorphic tasks. However, because event-camera research is still in its infancy, far less labeled event-stream data is available than labeled RGB data. The traditional way to enlarge the sample size, converting static images into event streams by simulation, cannot reproduce characteristics of event cameras such as high temporal resolution. To take advantage of both the rich knowledge in labeled RGB images and the features of the event camera, we propose a transfer learning method from the RGB domain to the event domain. Specifically, we first introduce a transfer learning framework named R2ETL (RGB to Event Transfer Learning), which includes a novel encoding alignment module and a feature alignment module. Then, we introduce the temporal centered kernel alignment (TCKA) loss function to improve the efficiency of transfer learning; it aligns the distribution of temporal neuron states by adding a temporal learning constraint. Finally, we theoretically analyze the amount of data required by a deep neuromorphic model to show the necessity of our method. Extensive experiments on the event-stream datasets N-MNIST, CIFAR10-DVS, and N-Caltech101 demonstrate that the proposed framework outperforms state-of-the-art SNN and artificial neural network (ANN) models trained on event streams. This indicates that the R2ETL framework can leverage the knowledge in labeled RGB images to help train SNNs on event streams.
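
The abstract does not give the TCKA formula, so the following is only an illustrative sketch of the general idea: it computes the standard linear centered kernel alignment (CKA) between source-domain and target-domain features at each time step and averages the result over time. The function names `linear_cka` and `temporal_cka_loss`, the `(T, n, d)` array layout, and the assumption that sample i in the source batch is paired with sample i in the target batch are all hypothetical choices made here, and the paper's actual loss may differ.

```python
import numpy as np


def linear_cka(x, y):
    """Standard linear CKA between feature matrices x (n, d1) and y (n, d2).

    Rows are assumed to be paired samples (row i of x corresponds to row i of y).
    """
    # Center each feature dimension across the batch.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(y.T @ x, "fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, "fro")
    norm_y = np.linalg.norm(y.T @ y, "fro")
    return cross / (norm_x * norm_y + 1e-12)  # small epsilon avoids 0/0


def temporal_cka_loss(src, tgt):
    """Hypothetical temporal alignment loss: 1 - mean per-time-step CKA.

    src, tgt: arrays of shape (T, n, d_src) and (T, n, d_tgt) holding
    neuron states (for example membrane potentials) at each of T time steps.
    """
    if src.shape[0] != tgt.shape[0]:
        raise ValueError("source and target must share the number of time steps")
    scores = [linear_cka(src[t], tgt[t]) for t in range(src.shape[0])]
    return 1.0 - float(np.mean(scores))
```

Because linear CKA is invariant to isotropic scaling, a loss of this form penalizes differences in the structure of the two representations rather than in their magnitude, which is one plausible reason a CKA-style objective suits alignment across domains with different input statistics.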
