A novel hybrid LSTM-Graph Attention Network for cross-subject analysis on thinking and speaking state using EEG signals

Journal of Intelligent & Fuzzy Systems. Pub Date: 2024-03-22. DOI: 10.3233/jifs-233143

N. Ramkumar, D. Karthika Renuka

Citations: 0

Abstract

In recent years, the rapid advancement of deep learning has increased interest in using electroencephalogram (EEG) signals for automatic speech recognition. However, EEG signals vary significantly between individuals, so EEG-based speech recognition faces challenges from individual differences across subjects, which ultimately degrade recognition performance. This study proposes a novel approach to EEG-based speech recognition that combines Long Short-Term Memory (LSTM) and Graph Attention Network (GAT). The LSTM component processes sequential patterns in the data, capturing temporal dependencies and extracting pertinent features. The GAT component exploits the interconnections among data points, which may represent channels, nodes, or features, in the form of a graph. The model not only examines more closely the connection between connectivity features and the thinking and speaking states, but also addresses the challenge of individual differences across subjects. The experimental results, all obtained on the KaraOne dataset, show the effectiveness of the proposed approach. For the thinking state, the average accuracies are 65.7% for single subjects and 67.3% for the cross-subject condition. Similarly, for the speaking state, the average accuracies are 65.4% for single subjects and 67.4% for the cross-subject condition. These outcomes highlight the model's positive impact on the task of cross-subject EEG speech recognition. The motivations for conducting cross-subject analysis are real-world applicability, generalization, adaptation and personalization, and performance evaluation.
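The abstract does not give the layer details, so the following is only a minimal sketch of the generic graph attention mechanism that a GAT component relies on, not the authors' implementation. It assumes that each graph node is an EEG channel and that each node already carries a feature vector (for example, a temporal summary such as an LSTM might produce; random numbers stand in for it here). The function name `gat_head`, the toy adjacency matrix, and all dimensions are hypothetical choices made for illustration.

```python
import numpy as np


def leaky_relu(x, slope=0.2):
    """LeakyReLU, the nonlinearity commonly used for raw attention scores."""
    return np.where(x > 0, x, slope * x)


def gat_head(h, adj, W, a, slope=0.2):
    """One graph attention head (final output activation omitted).

    h   : (N, F)      node features, one row per node (here: per channel)
    adj : (N, N)      0/1 adjacency matrix that includes self-loops
    W   : (F, F_out)  shared linear projection
    a   : (2 * F_out,) attention vector
    Returns the updated features (N, F_out) and the attention matrix (N, N).
    """
    z = h @ W                                   # project every node
    f_out = z.shape[1]
    src = z @ a[:f_out]                         # contribution of node i
    dst = z @ a[f_out:]                         # contribution of node j
    # e[i, j] = LeakyReLU(a^T [z_i || z_j]) for every ordered pair (i, j)
    e = leaky_relu(src[:, None] + dst[None, :], slope)
    e = np.where(adj > 0, e, -np.inf)           # keep only real edges
    e = e - e.max(axis=1, keepdims=True)        # numerically stable softmax
    w = np.exp(e)
    alpha = w / w.sum(axis=1, keepdims=True)    # normalize over neighbors
    return alpha @ z, alpha                     # attention-weighted sum


# Toy example: 4 hypothetical channels with placeholder features.
rng = np.random.default_rng(0)
n_channels, f_in, f_out = 4, 3, 2
h = rng.normal(size=(n_channels, f_in))
adj = np.eye(n_channels)                        # self-loops
adj[0, 1] = adj[1, 0] = 1.0                     # one undirected edge
W = rng.normal(size=(f_in, f_out))
a = rng.normal(size=2 * f_out)

h_out, alpha = gat_head(h, adj, W, a)
print(h_out.shape)        # shape of the updated node features
print(alpha.sum(axis=1))  # each row of attention weights sums to 1
```

In this sketch a channel with no neighbors other than itself simply keeps its own projected features, while connected channels mix their features according to the learned attention weights. In a trained model `W` and `a` would be learned parameters, and several heads would typically be combined.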

Source journal

Journal of Intelligent & Fuzzy Systems

Self-citation rate

0.00%

Number of publications