Decoding articulatory and phonetic components of naturalistic continuous speech from the distributed language network.

IF 3.8 | Zone 3 (Medicine) | JCR Q2 (ENGINEERING, BIOMEDICAL)

Journal of Neural Engineering, vol. 20, no. 4 | Pub Date: 2023-08-14 | DOI: 10.1088/1741-2552/ace9fb

Tessy M Thomas, Aditya Singh, Latané P Bullock, Daniel Liang, Cale W Morse, Xavier Scherschligt, John P Seymour, Nitin Tandon

Citations: 0

Abstract

Objective. The speech production network relies on a widely distributed brain network. However, research and development of speech brain-computer interfaces (speech-BCIs) has typically focused on decoding speech only from superficial subregions readily accessible by subdural grid arrays, typically placed over the sensorimotor cortex. Alternatively, the technique of stereo-electroencephalography (sEEG) enables access to distributed brain regions using multiple depth electrodes with lower surgical risks, especially in patients with brain injuries resulting in aphasia and other speech disorders.

Approach. To investigate the decoding potential of widespread electrode coverage in multiple cortical sites, we used a naturalistic continuous speech production task. We obtained neural recordings using sEEG from eight participants while they read aloud sentences. We trained linear classifiers to decode distinct speech components (articulatory components and phonemes) solely based on broadband gamma activity and evaluated the decoding performance using nested five-fold cross-validation.

Main Results. We achieved an average classification accuracy of 18.7% across 9 places of articulation (e.g. bilabials, palatals), 26.5% across 5 manner of articulation (MOA) labels (e.g. affricates, fricatives), and 4.81% across 38 phonemes. The highest classification accuracies achieved with a single large dataset were 26.3% for place of articulation, 35.7% for MOA, and 9.88% for phonemes. Electrodes that contributed high decoding power were distributed across multiple sulcal and gyral sites in both dominant and non-dominant hemispheres, including ventral sensorimotor, inferior frontal, superior temporal, and fusiform cortices. Rather than finding a distinct cortical locus for each speech component, we observed neural correlates of both articulatory and phonetic components in multiple hubs of a widespread language production network.

Significance. These results reveal the distributed cortical representations whose activity can enable decoding speech components during continuous speech through the use of this minimally invasive recording method, elucidating language neurobiology and neural targets for future speech-BCIs.
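The evaluation scheme named in the Approach, a linear classifier scored with nested five-fold cross-validation, can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' pipeline: the abstract does not say which linear classifier or hyperparameters were used, so a nearest-centroid rule (which has linear decision boundaries) with a hypothetical `shrink` hyperparameter stands in, and the features are synthetic numbers standing in for per-electrode broadband gamma activity. The outer folds estimate accuracy; the inner folds, built only from the outer training data, choose the hyperparameter.

```python
import random
from statistics import mean


def k_folds(indices, k, rng):
    """Shuffle the indices and deal them into k nearly equal folds."""
    idx = list(indices)
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit(X, y, idx, shrink):
    """Nearest-centroid model: one mean vector per class, shrunk toward the
    grand mean by the (hypothetical) `shrink` hyperparameter."""
    dim = len(X[0])
    grand = [mean(X[i][d] for i in idx) for d in range(dim)]
    model = {}
    for label in sorted({y[i] for i in idx}):
        rows = [X[i] for i in idx if y[i] == label]
        model[label] = [
            (1 - shrink) * mean(r[d] for r in rows) + shrink * grand[d]
            for d in range(dim)
        ]
    return model


def predict(model, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, model[c])))


def accuracy(X, y, train, test, shrink):
    """Fit on the train indices and return the fraction correct on the test indices."""
    model = fit(X, y, train, shrink)
    return mean(1.0 if predict(model, X[i]) == y[i] else 0.0 for i in test)


def nested_cv(X, y, shrinks, k=5, seed=0):
    """Outer k-fold loop estimates accuracy; inner k-fold loop picks `shrink`."""
    rng = random.Random(seed)
    outer = k_folds(range(len(X)), k, rng)
    scores = []
    for o in range(k):
        test = outer[o]
        train = [i for f in range(k) if f != o for i in outer[f]]
        inner = k_folds(train, k, rng)  # inner folds never see the outer test fold

        def inner_score(s):
            return mean(
                accuracy(X, y,
                         [i for g in range(k) if g != v for i in inner[g]],
                         inner[v], s)
                for v in range(k)
            )

        best = max(shrinks, key=inner_score)
        scores.append(accuracy(X, y, train, test, best))
    return mean(scores)


# Synthetic stand-in data: 3 hypothetical classes, 4 features, 30 samples each.
data_rng = random.Random(1)
labels = ["class_a", "class_b", "class_c"]
X, y = [], []
for c, label in enumerate(labels):
    for _ in range(30):
        X.append([data_rng.gauss(2.0 * c, 1.0) for _ in range(4)])
        y.append(label)

acc = nested_cv(X, y, shrinks=[0.0, 0.5])
print(f"nested CV accuracy: {acc:.3f} (uniform chance: {1 / len(labels):.3f})")
```

Accuracies from such a procedure are usually read against chance. If the classes were balanced, which the abstract does not state, uniform chance would be 1/9 ≈ 11.1% for 9 places of articulation, 1/5 = 20% for 5 MOA labels, and 1/38 ≈ 2.6% for 38 phonemes.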

Source journal

Journal of Neural Engineering (Engineering & Technology: Engineering, Biomedical)

CiteScore: 7.80
Self-citation rate: 12.50%
Articles published: 319
Review time: 4.2 months

About the journal: The goal of Journal of Neural Engineering (JNE) is to act as a forum for the interdisciplinary field of neural engineering where neuroscientists, neurobiologists and engineers can publish their work in one periodical that bridges the gap between neuroscience and engineering. The journal publishes articles in the field of neural engineering at the molecular, cellular and systems levels. The scope of the journal encompasses experimental, computational, theoretical, clinical and applied aspects of: Innovative neurotechnology; Brain-machine (computer) interface; Neural interfacing; Bioelectronic medicines; Neuromodulation; Neural prostheses; Neural control; Neuro-rehabilitation; Neurorobotics; Optical neural engineering; Neural circuits: artificial & biological; Neuromorphic engineering; Neural tissue regeneration; Neural signal processing; Theoretical and computational neuroscience; Systems neuroscience; Translational neuroscience; Neuroimaging.