Continuous and discrete decoding of overt speech with scalp electroencephalography (EEG).

Alexander Craik, Heather R Dial, Jose L Contreras-Vidal
*Journal of neural engineering*, published 2024-10-30.
DOI: 10.1088/1741-2552/ad8d0a

Abstract

Neurological disorders affecting speech production adversely impact quality of life for over 7 million individuals in the US. Traditional speech interfaces such as eye-tracking devices and P300 spellers are slow and unnatural for these patients. An alternative solution, speech Brain-Computer Interfaces (BCIs), directly decodes speech characteristics, offering a more natural communication mechanism. This research explores the feasibility of decoding speech features using non-invasive EEG. Nine neurologically intact participants were equipped with a 63-channel EEG system, with additional sensors used to eliminate eye artifacts. Participants read aloud sentences, displayed on a screen, that were selected for phonetic similarity to the English language. Deep learning models, including Convolutional Neural Networks and Recurrent Neural Networks with and without attention modules, were optimized with a focus on minimizing trainable parameters and using small input window sizes for real-time application. These models were employed for discrete and continuous speech decoding tasks, achieving statistically significant participant-independent decoding performance for discrete classes and for continuous characteristics of the produced audio signal. A frequency sub-band analysis highlighted the importance of certain frequency bands (delta, theta, and gamma) for decoding performance, and a perturbation analysis was used to identify crucial channels. The channel selection methods assessed did not significantly improve performance, suggesting a distributed representation of speech information in the EEG signals. Leave-one-out training demonstrated the feasibility of exploiting speech neural correlates common across participants, reducing data collection requirements for individual participants.
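The frequency sub-band analysis described above can be illustrated with a minimal band-power computation on a single EEG channel. This is a hypothetical sketch, not the paper's pipeline: the band edges below follow common EEG conventions, and the sampling rate and test signal are assumptions for illustration only.

```python
import numpy as np

def band_power(signal, fs, low, high):
    """Mean spectral power of `signal` within [low, high) Hz, via the FFT."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    mask = (freqs >= low) & (freqs < high)
    return spectrum[mask].mean()

fs = 250                                  # sampling rate in Hz (assumed)
t = np.arange(0, 2.0, 1.0 / fs)           # a 2 s single-channel window
# Synthetic "EEG": a strong 6 Hz (theta-range) rhythm plus weak 40 Hz activity.
eeg = np.sin(2 * np.pi * 6 * t) + 0.1 * np.sin(2 * np.pi * 40 * t)

# Conventional band edges in Hz; the paper's exact definitions may differ.
bands = {"delta": (1, 4), "theta": (4, 8), "gamma": (30, 70)}
powers = {name: band_power(eeg, fs, lo, hi) for name, (lo, hi) in bands.items()}
# The 6 Hz component dominates, so theta power exceeds delta and gamma here.
```

In a sub-band analysis of the kind the abstract describes, features like these (or band-limited versions of the raw signal) would be fed to the decoder separately per band to measure each band's contribution to decoding performance.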
