Identification of perceived sentences using deep neural networks in EEG.

Carlos Valle, Carolina Mendez-Orellana, Christian Herff, Maria Rodriguez-Fernandez
Journal of Neural Engineering, published 2024-10-30. DOI: 10.1088/1741-2552/ad88a3 (https://doi.org/10.1088/1741-2552/ad88a3)

Abstract

Objective. Decoding speech from brain activity can enable communication for individuals with speech disorders. Deep neural networks (DNNs) have shown great potential for speech decoding applications. However, the limited availability of large datasets containing neural recordings from speech-impaired subjects poses a challenge. Leveraging data from healthy participants can mitigate this limitation and expedite the development of speech neuroprostheses while minimizing the need for patient-specific training data.

Approach. In this study, we collected a substantial dataset consisting of recordings from 56 healthy participants using 64 EEG channels. Multiple neural networks were trained to classify perceived Spanish sentences using subject-independent, mixed-subjects, and fine-tuning approaches. The dataset has been made publicly available to foster further research in this area.

Main results. Our results demonstrate a remarkable level of accuracy in distinguishing sentence identity across 30 classes, showing the feasibility of training DNNs to decode sentence identity from perceived speech using EEG. Notably, the subject-independent approach yielded accuracy comparable to the mixed-subjects approach, although with higher variability among subjects. Fine-tuning on data from the target subject yielded even higher accuracy, indicating that the networks learned features of brain activity that generalize across individuals while remaining adaptable to individual participants. Furthermore, our analyses indicate that EEGNet and DeepConvNet exhibit comparable performance, outperforming ShallowConvNet for sentence identity decoding. Finally, our Grad-CAM visualization analysis identifies the key areas influencing the network's predictions, offering valuable insights into the neural processes underlying language perception and comprehension.

Significance. These findings advance our understanding of EEG-based speech perception decoding and hold promise for the development of speech neuroprostheses, particularly in scenarios where subjects cannot provide their own training data.
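The three training schemes described in the abstract differ chiefly in how trials are partitioned between training and evaluation. A minimal sketch of those partitions, assuming trials are labeled by subject (function names and the `n_adapt` parameter are illustrative, not from the paper):

```python
import numpy as np

def subject_independent_split(subject_ids, held_out):
    """Leave-one-subject-out: train on all other subjects, test on one."""
    subject_ids = np.asarray(subject_ids)
    test_mask = subject_ids == held_out
    return np.where(~test_mask)[0], np.where(test_mask)[0]

def mixed_subjects_split(n_trials, test_frac=0.2, seed=0):
    """Random split over trials pooled across all subjects."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_trials)
    n_test = int(n_trials * test_frac)
    return idx[n_test:], idx[:n_test]

def fine_tuning_split(subject_ids, held_out, n_adapt, seed=0):
    """Pretrain subject-independently, then reserve a few of the held-out
    subject's trials for adaptation and evaluate on the rest."""
    pretrain_idx, target_idx = subject_independent_split(subject_ids, held_out)
    rng = np.random.default_rng(seed)
    perm = rng.permutation(target_idx)
    return pretrain_idx, perm[:n_adapt], perm[n_adapt:]
```

The subject-independent split is equivalent to scikit-learn's `LeaveOneGroupOut` with subjects as groups; the sketch keeps it explicit to show how the fine-tuning scheme builds on it.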

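The Grad-CAM analysis reduces to a framework-agnostic computation once the convolutional feature maps and the gradients of the target-class score have been extracted from the trained network. A hedged numpy sketch of that core step (array shapes and names are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM relevance map from feature maps A_k and gradients dY/dA_k
    of the target-class score, both shaped (n_filters, n_channels, n_times).
    Returns a (n_channels, n_times) map over the EEG channel-by-time grid."""
    # Filter-importance weights: global average pooling of the gradients.
    weights = gradients.mean(axis=(1, 2))              # (n_filters,)
    # Weighted sum of feature maps across filters.
    cam = np.tensordot(weights, feature_maps, axes=1)  # (n_channels, n_times)
    # ReLU keeps only evidence that increases the class score.
    cam = np.maximum(cam, 0.0)
    if cam.max() > 0:
        cam = cam / cam.max()                          # normalize to [0, 1]
    return cam
```

In a deep-learning framework, `feature_maps` and `gradients` would typically come from forward and backward hooks on the last convolutional layer; the resulting map highlights which channels and time points most influenced the prediction.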