Online Ternary Classification of Covert Speech by Leveraging the Passive Perception of Speech.

IF 6.4 · Zone 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal of Neural Systems. Pub Date: 2023-09-01. DOI: 10.1142/S012906572350048X

Jae Moon, Tom Chau

Volume 33, Issue 9, Article 2350048. Publication type: Journal Article.

Citations: 0

Abstract

Brain-computer interfaces (BCIs) provide communicative alternatives to those without functional speech. Covert speech (CS)-based BCIs enable communication simply by thinking of words and thus have intuitive appeal. However, an elusive barrier to their clinical translation is the collection of voluminous examples of high-quality CS signals, as iteratively rehearsing words for long durations is mentally fatiguing. Research on CS and speech perception (SP) identifies common spatiotemporal patterns in their respective electroencephalographic (EEG) signals, pointing towards shared encoding mechanisms. The goal of this study was to investigate whether a model that leverages the signal similarities between SP and CS can differentiate speech-related EEG signals online. Ten participants completed a dyadic protocol in which, on each trial, they listened to a randomly selected word and then mentally rehearsed it. In the offline sessions, eight words were presented to participants. For the subsequent online sessions, the two most distinct words (most separable in terms of their EEG signals) were chosen to form a ternary classification problem (two words and rest). The model comprised a functional mapping derived from SP and CS signals of the same speech token (features were extracted via a Riemannian approach). An average ternary online accuracy of 75.3% (60% chance level) was achieved across participants, with individual accuracies as high as 93%. Moreover, we observed that the signal-to-noise ratio (SNR) of CS signals was enhanced by perception-covert modeling according to the level of high-frequency ([Formula: see text]-band) correspondence between CS and SP. These findings may lead to less burdensome data collection for training speech BCIs, which could eventually enhance the rate at which the vocabulary can grow.
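The abstract states only that features are extracted "via a Riemannian approach" and gives no further detail. As a rough illustration of what such a step often looks like in EEG work, the sketch below computes one spatial covariance matrix per trial and projects it onto the tangent space at a reference matrix. The function names, the log-Euclidean reference mean, the small ridge term, and the synthetic data are all assumptions made for illustration; none of them is taken from the paper, and the authors' actual pipeline may differ.

```python
import numpy as np


def _spd_apply(mat, fn):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * fn(vals)) @ vecs.T


def trial_covariances(trials, ridge=1e-6):
    """Map (n_trials, n_channels, n_samples) EEG to per-trial covariances."""
    centered = trials - trials.mean(axis=2, keepdims=True)
    covs = centered @ centered.transpose(0, 2, 1) / (trials.shape[2] - 1)
    # Assumed small ridge so every matrix is strictly positive definite.
    return covs + ridge * np.eye(trials.shape[1])


def log_euclidean_mean(covs):
    """Reference point: matrix exponential of the mean matrix logarithm."""
    logs = np.stack([_spd_apply(c, np.log) for c in covs])
    return _spd_apply(logs.mean(axis=0), np.exp)


def tangent_space_features(covs, reference):
    """Project SPD matrices onto the tangent space at `reference`."""
    inv_sqrt = _spd_apply(reference, lambda v: 1.0 / np.sqrt(v))
    rows, cols = np.triu_indices(reference.shape[0])
    # Off-diagonal entries are scaled by sqrt(2) to preserve the norm.
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    feats = []
    for cov in covs:
        whitened = inv_sqrt @ cov @ inv_sqrt
        whitened = (whitened + whitened.T) / 2.0  # remove rounding asymmetry
        log_map = _spd_apply(whitened, np.log)
        feats.append(weights * log_map[rows, cols])
    return np.asarray(feats)


# Synthetic stand-in for EEG: 20 trials, 8 channels, 256 samples each.
rng = np.random.default_rng(0)
trials = rng.standard_normal((20, 8, 256))
covs = trial_covariances(trials)
reference = log_euclidean_mean(covs)
feats = tangent_space_features(covs, reference)
```

Each trial becomes a vector of length n_channels * (n_channels + 1) / 2, which an ordinary linear classifier could then use for a three-class problem such as the two words plus rest described above.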


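Source journal

International Journal of Neural Systems (Engineering and Technology: Computer Science, Artificial Intelligence)

CiteScore: 11.30

Self-citation rate: 28.80%

Articles published: 116

Review time: 24 months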

About the journal: The International Journal of Neural Systems is a monthly, rigorously peer-reviewed transdisciplinary journal focusing on information processing in both natural and artificial neural systems. Special interests include machine learning, computational neuroscience and neurology. The journal prioritizes innovative, high-impact articles spanning multiple fields, including neurosciences and computer science and engineering. It adopts an open-minded approach to this multidisciplinary field, serving as a platform for novel ideas and enhanced understanding of collective and cooperative phenomena in computationally capable systems.