{"title":"MSC-transformer-based 3D-attention with knowledge distillation for multi-action classification of separate lower limbs","authors":"Heng Yan , Zilu Wang , Junhua Li","doi":"10.1016/j.neunet.2025.107806","DOIUrl":null,"url":null,"abstract":"<div><div>Deep learning has been extensively applied to motor imagery (MI) classification using electroencephalogram (EEG). However, most existing deep learning models do not extract features from EEG using dimension-specific attention mechanisms based on the characteristics of each dimension (e.g., spatial dimension), while effectively integrate local and global features. Furthermore, implicit information generated by the models has been ignored, leading to underutilization of essential information of EEG. Although MI classification has been relatively thoroughly investigated, the exploration of classification including real movement (RM) and motor observation (MO) is very limited, especially for separate lower limbs. To address the above problems and limitations, we proposed a multi-scale separable convolutional Transformer-based filter-spatial-temporal attention model (MSC-T3AM) to classify multiple lower limb actions. In MSC-T3AM, spatial attention, filter and temporal attention modules are embedded to allocate appropriate attention to each dimension. Multi-scale separable convolutions (MSC) are separately applied after the projections of query, key, and value in self-attention module to improve computational efficiency and classification performance. Furthermore, knowledge distillation (KD) was utilized to help model learn suitable probability distribution. The comparison results demonstrated that MSC-T3AM with online KD achieved best performance in classification accuracy, exhibiting an elevation of 2 %-19 % compared to a few counterpart models. The visualization of features extracted by MSC-T3AM with online KD reiterated the superiority of the proposed model. The ablation results showed that filter and temporal attention modules contributed most for performance improvement (improved by 2.8 %), followed by spatial attention module (1.2 %) and MSC module (1 %). Our study also suggested that online KD was better than offline KD and the case without KD. The code of MSC-T3AM is available at: <span><span>https://github.com/BICN001/MSC-T3AM</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"191 ","pages":"Article 107806"},"PeriodicalIF":6.3000,"publicationDate":"2025-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608025006860","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Deep learning has been extensively applied to motor imagery (MI) classification using electroencephalogram (EEG). However, most existing deep learning models do not extract features from EEG using dimension-specific attention mechanisms tailored to the characteristics of each dimension (e.g., the spatial dimension), nor do they effectively integrate local and global features. Furthermore, implicit information generated by the models has been ignored, leading to underutilization of essential information in the EEG. Although MI classification has been investigated relatively thoroughly, the exploration of classification involving real movement (RM) and motor observation (MO) is very limited, especially for separate lower limbs. To address these problems and limitations, we proposed a multi-scale separable convolutional Transformer-based filter-spatial-temporal attention model (MSC-T3AM) to classify multiple lower limb actions. In MSC-T3AM, filter, spatial, and temporal attention modules are embedded to allocate appropriate attention to each dimension. Multi-scale separable convolutions (MSC) are separately applied after the query, key, and value projections in the self-attention module to improve computational efficiency and classification performance. Furthermore, knowledge distillation (KD) was utilized to help the model learn a suitable probability distribution. The comparison results demonstrated that MSC-T3AM with online KD achieved the best classification accuracy, with an improvement of 2%-19% over several counterpart models. The visualization of features extracted by MSC-T3AM with online KD further confirmed the superiority of the proposed model. The ablation results showed that the filter and temporal attention modules contributed the most to performance improvement (2.8%), followed by the spatial attention module (1.2%) and the MSC module (1%). Our study also suggested that online KD outperformed both offline KD and the case without KD. The code of MSC-T3AM is available at: https://github.com/BICN001/MSC-T3AM.
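To make the two core ideas in the abstract concrete, below is a minimal PyTorch sketch of (1) multi-scale separable convolutions applied to the query, key, and value projections inside a self-attention module, and (2) an online (mutual-learning) knowledge-distillation loss between two peer classifiers. This is not the authors' implementation (see the linked repository); the kernel sizes, single-head attention, temperature, and loss weighting are illustrative assumptions only.

```python
# Conceptual sketch, NOT the authors' code (see https://github.com/BICN001/MSC-T3AM).
# Kernel sizes, dimensions, temperature, and the KD weight are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleSeparableConv(nn.Module):
    """Depthwise (separable) 1-D convolutions at several temporal scales,
    summed so the output keeps the input shape (batch, channels, time)."""

    def __init__(self, channels, kernel_sizes=(3, 7, 15)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernel_sizes
        )

    def forward(self, x):                      # x: (batch, channels, time)
        return sum(branch(x) for branch in self.branches)


class MSCSelfAttention(nn.Module):
    """Single-head self-attention whose Q/K/V projections are each refined
    by a multi-scale separable convolution before the attention product."""

    def __init__(self, dim):
        super().__init__()
        self.q_proj, self.k_proj, self.v_proj = (nn.Linear(dim, dim) for _ in range(3))
        self.q_conv, self.k_conv, self.v_conv = (MultiScaleSeparableConv(dim) for _ in range(3))
        self.scale = dim ** -0.5

    def forward(self, x):                      # x: (batch, time, dim)
        def refine(proj, conv):
            h = proj(x).transpose(1, 2)        # (batch, dim, time) for the conv
            return conv(h).transpose(1, 2)     # back to (batch, time, dim)

        q = refine(self.q_proj, self.q_conv)
        k = refine(self.k_proj, self.k_conv)
        v = refine(self.v_proj, self.v_conv)
        attn = torch.softmax(q @ k.transpose(1, 2) * self.scale, dim=-1)
        return attn @ v


def online_kd_loss(logits_a, logits_b, labels, temperature=4.0, alpha=0.5):
    """Symmetric online distillation: each branch fits the labels and also
    mimics the other's softened output (a common mutual-learning form)."""
    ce = F.cross_entropy(logits_a, labels) + F.cross_entropy(logits_b, labels)
    soft_a = F.log_softmax(logits_a / temperature, dim=-1)
    soft_b = F.log_softmax(logits_b / temperature, dim=-1)
    kd = F.kl_div(soft_a, soft_b.exp(), reduction="batchmean") \
       + F.kl_div(soft_b, soft_a.exp(), reduction="batchmean")
    return ce + alpha * temperature ** 2 * kd
```

The sketch only illustrates how convolutional refinement of Q/K/V and an online KD objective fit together; the paper's full model additionally includes the filter, spatial, and temporal attention modules described above.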
About the Journal
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.