A novel hybrid LSTM-Graph Attention Network for cross-subject analysis on thinking and speaking state using EEG signals

N. Ramkumar, D. Karthika Renuka
{"title":"A novel hybrid LSTM-Graph Attention Network for cross-subject analysis on thinking and speaking state using EEG signals","authors":"N. Ramkumar, D. Karthika Renuka","doi":"10.3233/jifs-233143","DOIUrl":null,"url":null,"abstract":"In recent times, the rapid advancement of deep learning has led to increased interest in utilizing Electroencephalogram (EEG) signals for automatic speech recognition. However, due to the significant variation observed in EEG signals from different individuals, the field of EEG-based speech recognition faces challenges related to individual differences across subjects, which ultimately impact recognition performance. In this investigation, a novel approach is proposed for EEG-based speech recognition that combines the capabilities of Long Short Term Memory (LSTM) and Graph Attention Network (GAT). The LSTM component of the model is designed to process sequential patterns within the data, enabling it to capture temporal dependencies and extract pertinent features. On the other hand, the GAT component exploits the interconnections among data points, which may represent channels, nodes, or features, in the form of a graph. This innovative model not only delves deeper into the connection between connectivity features and thinking as well as speaking states, but also addresses the challenge of individual disparities across subjects. The experimental results showcase the effectiveness of the proposed approach. When considering the thinking state, the average accuracy for single subjects and cross-subject are 65.7% and 67.3% respectively. Similarly, for the speaking state, the average accuracies were 65.4% for single subjects and 67.4% for cross-subject conditions, all based on the KaraOne dataset. These outcomes highlight the model’s positive impact on the task of cross-subject EEG speech recognition. The motivations for conducting cross subject are real world applicability, Generalization, Adaptation and personalization and performance evaluation.","PeriodicalId":509313,"journal":{"name":"Journal of Intelligent & Fuzzy Systems","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Intelligent & Fuzzy Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3233/jifs-233143","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

In recent times, the rapid advancement of deep learning has led to increased interest in utilizing Electroencephalogram (EEG) signals for automatic speech recognition. However, because EEG signals vary substantially across individuals, EEG-based speech recognition faces challenges related to inter-subject differences, which ultimately degrade recognition performance. This investigation proposes a novel approach for EEG-based speech recognition that combines the capabilities of a Long Short-Term Memory (LSTM) network and a Graph Attention Network (GAT). The LSTM component of the model processes sequential patterns within the data, capturing temporal dependencies and extracting pertinent features. The GAT component, in turn, exploits the interconnections among data points (channels, nodes, or features) represented as a graph. This model not only examines the relationship between connectivity features and the thinking and speaking states, but also addresses the challenge of individual disparities across subjects. The experimental results demonstrate the effectiveness of the proposed approach. For the thinking state, the average accuracies were 65.7% for single-subject and 67.3% for cross-subject conditions; for the speaking state, they were 65.4% and 67.4%, respectively, all on the KaraOne dataset. These outcomes highlight the model's positive impact on the task of cross-subject EEG speech recognition. The motivations for conducting cross-subject analysis are real-world applicability, generalization, adaptation and personalization, and performance evaluation.
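The paper does not include code, but the architecture described above can be illustrated with a minimal sketch: an LSTM summarizes each EEG channel's time series, and a single-head graph-attention layer then mixes information across channels (one node per channel) before classification. All layer sizes, the channel count, and the fully connected channel graph below are assumptions for illustration, not the authors' configuration.

```python
# Minimal sketch (not the authors' code) of a hybrid LSTM + graph-attention model
# for EEG windows shaped (batch, time, channels). Node = EEG channel.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GraphAttentionLayer(nn.Module):
    """Single-head GAT-style attention over channel-node features."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (batch, nodes, in_dim); adj: (nodes, nodes) binary adjacency
        Wh = self.W(h)                                   # (B, N, out_dim)
        B, N, D = Wh.shape
        hi = Wh.unsqueeze(2).expand(B, N, N, D)          # source node features
        hj = Wh.unsqueeze(1).expand(B, N, N, D)          # neighbour features
        e = F.leaky_relu(self.a(torch.cat([hi, hj], dim=-1)).squeeze(-1))
        e = e.masked_fill(adj == 0, float("-inf"))       # keep only graph edges
        alpha = torch.softmax(e, dim=-1)                 # attention coefficients
        return F.elu(torch.matmul(alpha, Wh))            # (B, N, out_dim)


class LSTMGAT(nn.Module):
    """LSTM per channel for temporal features, GAT across channels, then classify."""

    def __init__(self, n_channels: int = 62, lstm_hidden: int = 64,
                 gat_out: int = 32, n_classes: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=lstm_hidden, batch_first=True)
        self.gat = GraphAttentionLayer(lstm_hidden, gat_out)
        self.fc = nn.Linear(n_channels * gat_out, n_classes)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, channels)
        B, T, C = x.shape
        # Run the LSTM over each channel's time series independently.
        x = x.permute(0, 2, 1).reshape(B * C, T, 1)       # (B*C, T, 1)
        _, (h_n, _) = self.lstm(x)                        # h_n: (1, B*C, hidden)
        node_feats = h_n.squeeze(0).reshape(B, C, -1)     # (B, C, hidden)
        node_feats = self.gat(node_feats, adj)            # (B, C, gat_out)
        return self.fc(node_feats.flatten(1))             # class logits


if __name__ == "__main__":
    channels, timesteps = 62, 128                         # assumed EEG window size
    adj = torch.ones(channels, channels)                  # placeholder: fully connected graph
    model = LSTMGAT(n_channels=channels)
    eeg = torch.randn(4, timesteps, channels)             # dummy batch of EEG windows
    print(model(eeg, adj).shape)                          # torch.Size([4, 2])
```

In this sketch the fully connected adjacency matrix is only a placeholder; in practice the channel graph would be derived from electrode layout or functional connectivity, which is where the connectivity features discussed in the abstract would enter.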