TFCNN-BiGRU with self-attention mechanism for automatic human emotion recognition using multi-channel EEG data

Essam H. Houssein, Asmaa Hammad, Nagwan Abdel Samee, Manal Abdullah Alohali, Abdelmgeid A. Ali

Cluster Computing, published 2024-07-19. DOI: https://doi.org/10.1007/s10586-024-04590-5
Abstract
Electroencephalogram (EEG)-based emotion recognition has attracted considerable interest in recent years. However, efficiently fusing the different temporal and spatial features of EEG signals to improve recognition performance remains an open problem. This study therefore proposes a new deep learning architecture that combines a time-frequency convolutional neural network (TFCNN), a bidirectional gated recurrent unit (BiGRU), and a self-attention mechanism (SAM) to automatically extract features from EEG signals and categorize emotions. First, the continuous wavelet transform (CWT), which responds readily to temporal frequency variations within EEG recordings, is applied as a layer within the convolutional stack to convert EEG signals into 2D scalogram images for time-series and spatial representation learning. Second, to encode more discriminative emotion features, the 2D-CNN, BiGRU, and SAM are trained jointly on these scalograms to capture the relevant spatial, local, temporal, and global information. Finally, the EEG signals are classified into emotional states. The network learns the temporal dependencies of EEG emotion signals with the BiGRU, extracts local spatial features with the TFCNN, and improves recognition accuracy with the SAM, which explores global signal correlations by reassigning weights to emotion features. Using the SEED and GAMEEMO datasets, the proposed strategy was evaluated on three classification tasks: one with two target classes (positive and negative), one with three target classes (positive, neutral, and negative), and one with four target classes (boring, calm, horror, and funny). Across these experiments, the proposed approach achieved emotion recognition accuracies of 93.1%, 96.2%, and 92.9% on the two-, three-, and four-class tasks, respectively, which are 0.281%, 1.98%, and 2.57% higher than existing approaches evaluated on the same datasets across different subjects. The open-source code is available at https://www.mathworks.com/matlabcentral/fileexchange/165126-tfcnn-bigru
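To make the scalogram step concrete, the sketch below shows how a single EEG channel could be converted into a 2D CWT scalogram image. This is a minimal Python illustration using PyWavelets, not the authors' released MATLAB code; the Morlet wavelet, 64 scales, and 200 Hz sampling rate are assumptions for illustration.

```python
import numpy as np
import pywt  # PyWavelets

def eeg_to_scalogram(signal: np.ndarray, fs: float = 200.0,
                     wavelet: str = "morl", n_scales: int = 64) -> np.ndarray:
    """Convert a 1-D EEG channel into a 2-D time-frequency scalogram via CWT."""
    scales = np.arange(1, n_scales + 1)          # assumed scale range
    coeffs, freqs = pywt.cwt(signal, scales, wavelet, sampling_period=1.0 / fs)
    return np.abs(coeffs)  # shape (n_scales, n_samples): the scalogram image

# Usage: one second of synthetic data standing in for an EEG channel
x = np.random.randn(200)
img = eeg_to_scalogram(x)
print(img.shape)  # (64, 200)
```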
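Likewise, the following is a minimal sketch of the overall 2D-CNN -> BiGRU -> self-attention pipeline, assuming PyTorch and illustrative layer sizes (the published configuration is in the linked code): the CNN extracts local spatial features from each scalogram, the BiGRU models temporal dependencies along the time axis, and multi-head self-attention reweights time steps before classification.

```python
import torch
import torch.nn as nn

class TFCNNBiGRUSAM(nn.Module):
    """Illustrative CNN-BiGRU-self-attention classifier for scalogram inputs.
    All layer sizes are assumptions, not the authors' published settings."""
    def __init__(self, n_classes: int = 3, hidden: int = 64):
        super().__init__()
        # 2D-CNN: local spatial features from the scalogram image
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # BiGRU: temporal dependencies across the (downsampled) time axis
        self.bigru = nn.GRU(input_size=32 * 16, hidden_size=hidden,
                            batch_first=True, bidirectional=True)
        # Self-attention: reweight time steps by global relevance
        self.attn = nn.MultiheadAttention(embed_dim=2 * hidden, num_heads=4,
                                          batch_first=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                      # x: (batch, 1, 64, T)
        f = self.cnn(x)                        # (batch, 32, 16, T/4)
        f = f.permute(0, 3, 1, 2).flatten(2)   # (batch, T/4, 32*16)
        h, _ = self.bigru(f)                   # (batch, T/4, 2*hidden)
        a, _ = self.attn(h, h, h)              # self-attention over time steps
        return self.head(a.mean(dim=1))        # pooled class logits

# Usage: a batch of 8 single-channel 64x200 scalograms, three emotion classes
x = torch.randn(8, 1, 64, 200)
model = TFCNNBiGRUSAM(n_classes=3)
print(model(x).shape)  # torch.Size([8, 3])
```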