{"title":"A novel hybrid LSTM-Graph Attention Network for cross-subject analysis on thinking and speaking state using EEG signals","authors":"N. Ramkumar, D. Karthika Renuka","doi":"10.3233/jifs-233143","DOIUrl":null,"url":null,"abstract":"In recent times, the rapid advancement of deep learning has led to increased interest in utilizing Electroencephalogram (EEG) signals for automatic speech recognition. However, due to the significant variation observed in EEG signals from different individuals, the field of EEG-based speech recognition faces challenges related to individual differences across subjects, which ultimately impact recognition performance. In this investigation, a novel approach is proposed for EEG-based speech recognition that combines the capabilities of Long Short Term Memory (LSTM) and Graph Attention Network (GAT). The LSTM component of the model is designed to process sequential patterns within the data, enabling it to capture temporal dependencies and extract pertinent features. On the other hand, the GAT component exploits the interconnections among data points, which may represent channels, nodes, or features, in the form of a graph. This innovative model not only delves deeper into the connection between connectivity features and thinking as well as speaking states, but also addresses the challenge of individual disparities across subjects. The experimental results showcase the effectiveness of the proposed approach. When considering the thinking state, the average accuracy for single subjects and cross-subject are 65.7% and 67.3% respectively. Similarly, for the speaking state, the average accuracies were 65.4% for single subjects and 67.4% for cross-subject conditions, all based on the KaraOne dataset. These outcomes highlight the model’s positive impact on the task of cross-subject EEG speech recognition. The motivations for conducting cross subject are real world applicability, Generalization, Adaptation and personalization and performance evaluation.","PeriodicalId":509313,"journal":{"name":"Journal of Intelligent & Fuzzy Systems","volume":" 6","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Intelligent & Fuzzy Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3233/jifs-233143","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
In recent times, the rapid advancement of deep learning has led to increased interest in utilizing Electroencephalogram (EEG) signals for automatic speech recognition. However, due to the significant variation observed in EEG signals from different individuals, the field of EEG-based speech recognition faces challenges related to individual differences across subjects, which ultimately impact recognition performance. In this investigation, a novel approach is proposed for EEG-based speech recognition that combines the capabilities of Long Short Term Memory (LSTM) and Graph Attention Network (GAT). The LSTM component of the model is designed to process sequential patterns within the data, enabling it to capture temporal dependencies and extract pertinent features. On the other hand, the GAT component exploits the interconnections among data points, which may represent channels, nodes, or features, in the form of a graph. This innovative model not only delves deeper into the connection between connectivity features and thinking as well as speaking states, but also addresses the challenge of individual disparities across subjects. The experimental results showcase the effectiveness of the proposed approach. When considering the thinking state, the average accuracy for single subjects and cross-subject are 65.7% and 67.3% respectively. Similarly, for the speaking state, the average accuracies were 65.4% for single subjects and 67.4% for cross-subject conditions, all based on the KaraOne dataset. These outcomes highlight the model’s positive impact on the task of cross-subject EEG speech recognition. The motivations for conducting cross subject are real world applicability, Generalization, Adaptation and personalization and performance evaluation.