Multi-dimensional Attention Spiking Transformer for Event-based Image Classification

2023 5th International Conference on Communications, Information System and Computer Engineering (CISCE). Pub Date: 2023-04-14. DOI: 10.1109/CISCE58541.2023.10142563

Lin Li, Yang Liu


Citations: 1

Abstract

Image classification is a vital research area in deep learning. However, the use of Artificial Neural Networks (ANNs) in conventional approaches requires vast computational power and memory. As a potential energy-efficient alternative, Spiking Neural Networks (SNNs) leverage temporal information and low-power sensors. Nonetheless, extracting spatio-temporal features from event-based image sequences for improved classification accuracies in SNNs poses a significant challenge. To address this, we propose a Multi-Dimensional Attention Spiking Transformer (MAST) model that integrates attention mechanisms and SNNs to capture spatio-temporal features in event-based image sequences. Consequently, the MAST model achieves state-of-the-art performance in various classification tasks, as shown by the evaluations on the CIFAR, DVS128 Gesture, and CIFAR10-DVS datasets. Overall, MAST exhibits promise in event-based image classification tasks, providing a new perspective on the integration of attention mechanisms and SNNs for improved image classification.

