TFCNN-BiGRU with self-attention mechanism for automatic human emotion recognition using multi-channel EEG data

Essam H. Houssein, Asmaa Hammad, Nagwan Abdel Samee, Manal Abdullah Alohali, Abdelmgeid A. Ali

Cluster Computing, published 2024-07-19. DOI: https://doi.org/10.1007/s10586-024-04590-5
Abstract
Electroencephalogram (EEG)-based emotion recognition has attracted considerable interest in recent years. However, efficiently fusing the different temporal and spatial features of EEG signals to improve recognition performance remains an open problem. This study therefore proposes a new deep learning architecture that combines a time-frequency convolutional neural network (TFCNN), a bidirectional gated recurrent unit (BiGRU), and a self-attention mechanism (SAM) to automatically extract features from EEG signals and categorize emotions. First, the continuous wavelet transform (CWT), which responds readily to temporal frequency variations within EEG recordings, is applied as a layer within the convolutional stack to convert EEG signals into 2D scalogram images for time-series and spatial representation learning. Second, to encode more discriminative emotion features, the 2D-CNN, BiGRU, and SAM are trained jointly on these scalograms to capture the relevant spatial, local, temporal, and global information. Finally, the EEG signals are classified into emotional states. The network learns the temporal dependencies of EEG emotion signals with the BiGRU, extracts local spatial features with the TFCNN, and improves recognition accuracy with the SAM, which explores global signal correlations by reassigning weights to emotion features. Using the SEED and GAMEEMO datasets, the proposed strategy was evaluated on three classification tasks: one with two target classes (positive and negative), one with three target classes (positive, neutral, and negative), and one with four target classes (boring, calm, horror, and funny). Across these experiments, the proposed approach achieved emotion recognition accuracies of 93.1%, 96.2%, and 92.9% on the two-, three-, and four-class tasks, respectively, which are 0.281%, 1.98%, and 2.57% higher than existing approaches evaluated on the same datasets across different subjects. The open-source code is available at https://www.mathworks.com/matlabcentral/fileexchange/165126-tfcnn-bigru
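To make the scalogram step concrete, the sketch below shows how a single EEG channel could be converted into a 2D CWT scalogram image. This is a minimal Python illustration using PyWavelets, not the authors' released MATLAB code; the Morlet wavelet, 64 scales, and 200 Hz sampling rate are assumptions for illustration.

```python
import numpy as np
import pywt  # PyWavelets

def eeg_to_scalogram(signal: np.ndarray, fs: float = 200.0,
                     wavelet: str = "morl", n_scales: int = 64) -> np.ndarray:
    """Convert a 1-D EEG channel into a 2-D time-frequency scalogram via CWT."""
    scales = np.arange(1, n_scales + 1)          # assumed scale range
    coeffs, freqs = pywt.cwt(signal, scales, wavelet, sampling_period=1.0 / fs)
    return np.abs(coeffs)  # shape (n_scales, n_samples): the scalogram image

# Usage: one second of synthetic data standing in for an EEG channel
x = np.random.randn(200)
img = eeg_to_scalogram(x)
print(img.shape)  # (64, 200)
```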
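Likewise, the following is a minimal sketch of the overall 2D-CNN -> BiGRU -> self-attention pipeline, assuming PyTorch and illustrative layer sizes (the published configuration is in the linked code): the CNN extracts local spatial features from each scalogram, the BiGRU models temporal dependencies along the time axis, and multi-head self-attention reweights time steps before classification.

```python
import torch
import torch.nn as nn

class TFCNNBiGRUSAM(nn.Module):
    """Illustrative CNN-BiGRU-self-attention classifier for scalogram inputs.
    All layer sizes are assumptions, not the authors' published settings."""
    def __init__(self, n_classes: int = 3, hidden: int = 64):
        super().__init__()
        # 2D-CNN: local spatial features from the scalogram image
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # BiGRU: temporal dependencies across the (downsampled) time axis
        self.bigru = nn.GRU(input_size=32 * 16, hidden_size=hidden,
                            batch_first=True, bidirectional=True)
        # Self-attention: reweight time steps by global relevance
        self.attn = nn.MultiheadAttention(embed_dim=2 * hidden, num_heads=4,
                                          batch_first=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                      # x: (batch, 1, 64, T)
        f = self.cnn(x)                        # (batch, 32, 16, T/4)
        f = f.permute(0, 3, 1, 2).flatten(2)   # (batch, T/4, 32*16)
        h, _ = self.bigru(f)                   # (batch, T/4, 2*hidden)
        a, _ = self.attn(h, h, h)              # self-attention over time steps
        return self.head(a.mean(dim=1))        # pooled class logits

# Usage: a batch of 8 single-channel 64x200 scalograms, three emotion classes
x = torch.randn(8, 1, 64, 200)
model = TFCNNBiGRUSAM(n_classes=3)
print(model(x).shape)  # torch.Size([8, 3])
```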