{"title":"一种基于脑电图-肌电图的脑机混合接口,用于解码无声和可听语音中的音调。","authors":"Jiawei Ju, Yifan Zhuang, Chunzhi Yi","doi":"10.1109/TNSRE.2025.3616276","DOIUrl":null,"url":null,"abstract":"<p><p>Speech recognition can be widely applied to support people with language disabilities by enabling them to communicate through brain-computer interfaces (BCIs), thus improving their quality of life. Despite the essential role of tonal variations in conveying semantic meaning, there have been limited studies focusing on the neural signatures of tones and their decoding. This paper systematically investigates the neural signatures of the four tones of Mandarin. It explores the feasibility of tone decoding in both silent and audible speech using a multimodal BCI based on electroencephalography (EEG) and electromyography (EMG). The time-frequency analysis of EEG has revealed significant variations in neural activation patterns across various tones and speech modes. For example, in the silent speech condition, temporal-domain analysis shows significant tone-dependent activation in the frontal lobe (ANOVA p = 0.000, Tone1 vs Tone2: p = 0.000, Tone1 vs Tone4: p = 0.000, Tone2 vs Tone3: p = 0.000, Tone3 vs Tone4: p = 0.001) and in channel F8 (ANOVA p=0.008, Tone1 vs Tone2: p=0.014, Tone2 vs Tone3: p=0.034). Spectral analysis shows significant differences between four tones in event-related spectral perturbation (ERSP) in the central region (p = 0.000) and channel C6 (p = 0.000). EMG analysis identifies a significant tone-related difference in activation of the left buccinator muscle (p = 0.023), and ERSP from the mentalis muscle also shows a marked difference across tones in both speech conditions (p = 0.00). Overall, tone-related neural differences were more pronounced in the audible speech condition than in the silent condition. For tone classification, RLDA and SVM classifiers achieved accuracies of 71.22% and 72.43%, respectively, using EEG temporal features in both speech modes. Additionally, the RLDA classifier with temporal features achieves binary tone classification accuracies of 90.92% (audible tones) and 91.00% (silent tones). The combination of EEG and EMG yields the highest speech modes decoding accuracy of 81.33%. These findings provide a potential strategy for speech restoration in tonal languages and further validate the feasibility of a speech brain-computer interface (BCI) as a clinically effective treatment for individuals with tonal language impairment.</p>","PeriodicalId":13419,"journal":{"name":"IEEE Transactions on Neural Systems and Rehabilitation Engineering","volume":"PP ","pages":""},"PeriodicalIF":5.2000,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"An EEG-EMG-based Hybrid Brain-Computer Interface for Decoding Tones in Silent and Audible Speech.\",\"authors\":\"Jiawei Ju, Yifan Zhuang, Chunzhi Yi\",\"doi\":\"10.1109/TNSRE.2025.3616276\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Speech recognition can be widely applied to support people with language disabilities by enabling them to communicate through brain-computer interfaces (BCIs), thus improving their quality of life. Despite the essential role of tonal variations in conveying semantic meaning, there have been limited studies focusing on the neural signatures of tones and their decoding. This paper systematically investigates the neural signatures of the four tones of Mandarin. 
It explores the feasibility of tone decoding in both silent and audible speech using a multimodal BCI based on electroencephalography (EEG) and electromyography (EMG). The time-frequency analysis of EEG has revealed significant variations in neural activation patterns across various tones and speech modes. For example, in the silent speech condition, temporal-domain analysis shows significant tone-dependent activation in the frontal lobe (ANOVA p = 0.000, Tone1 vs Tone2: p = 0.000, Tone1 vs Tone4: p = 0.000, Tone2 vs Tone3: p = 0.000, Tone3 vs Tone4: p = 0.001) and in channel F8 (ANOVA p=0.008, Tone1 vs Tone2: p=0.014, Tone2 vs Tone3: p=0.034). Spectral analysis shows significant differences between four tones in event-related spectral perturbation (ERSP) in the central region (p = 0.000) and channel C6 (p = 0.000). EMG analysis identifies a significant tone-related difference in activation of the left buccinator muscle (p = 0.023), and ERSP from the mentalis muscle also shows a marked difference across tones in both speech conditions (p = 0.00). Overall, tone-related neural differences were more pronounced in the audible speech condition than in the silent condition. For tone classification, RLDA and SVM classifiers achieved accuracies of 71.22% and 72.43%, respectively, using EEG temporal features in both speech modes. Additionally, the RLDA classifier with temporal features achieves binary tone classification accuracies of 90.92% (audible tones) and 91.00% (silent tones). The combination of EEG and EMG yields the highest speech modes decoding accuracy of 81.33%. These findings provide a potential strategy for speech restoration in tonal languages and further validate the feasibility of a speech brain-computer interface (BCI) as a clinically effective treatment for individuals with tonal language impairment.</p>\",\"PeriodicalId\":13419,\"journal\":{\"name\":\"IEEE Transactions on Neural Systems and Rehabilitation Engineering\",\"volume\":\"PP \",\"pages\":\"\"},\"PeriodicalIF\":5.2000,\"publicationDate\":\"2025-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Neural Systems and Rehabilitation Engineering\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://doi.org/10.1109/TNSRE.2025.3616276\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, BIOMEDICAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Neural Systems and Rehabilitation Engineering","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1109/TNSRE.2025.3616276","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, BIOMEDICAL","Score":null,"Total":0}
Citations: 0
Abstract
Speech recognition can support people with language disabilities by enabling them to communicate through brain-computer interfaces (BCIs), thereby improving their quality of life. Despite the essential role of tonal variation in conveying semantic meaning, few studies have examined the neural signatures of tones or their decoding. This paper systematically investigates the neural signatures of the four Mandarin tones and explores the feasibility of tone decoding in both silent and audible speech using a multimodal BCI based on electroencephalography (EEG) and electromyography (EMG). Time-frequency analysis of the EEG revealed significant variations in neural activation patterns across tones and speech modes. For example, in the silent speech condition, temporal-domain analysis showed significant tone-dependent activation in the frontal lobe (ANOVA p < 0.001; Tone1 vs Tone2: p < 0.001; Tone1 vs Tone4: p < 0.001; Tone2 vs Tone3: p < 0.001; Tone3 vs Tone4: p = 0.001) and at channel F8 (ANOVA p = 0.008; Tone1 vs Tone2: p = 0.014; Tone2 vs Tone3: p = 0.034). Spectral analysis showed significant differences among the four tones in event-related spectral perturbation (ERSP) in the central region (p < 0.001) and at channel C6 (p < 0.001). EMG analysis identified a significant tone-related difference in activation of the left buccinator muscle (p = 0.023), and ERSP from the mentalis muscle also differed markedly across tones in both speech conditions (p < 0.01). Overall, tone-related neural differences were more pronounced in audible than in silent speech. For tone classification, RLDA and SVM classifiers achieved accuracies of 71.22% and 72.43%, respectively, using EEG temporal features across both speech modes. The RLDA classifier with temporal features also achieved binary tone classification accuracies of 90.92% (audible tones) and 91.00% (silent tones). Combining EEG and EMG yielded the highest speech-mode decoding accuracy, 81.33%. These findings suggest a potential strategy for speech restoration in tonal languages and further support the feasibility of a speech BCI as a clinically effective treatment for individuals with tonal language impairment.
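The temporal-domain statistics reported above are per-channel one-way ANOVAs across the four tones, followed by pairwise comparisons. Below is a minimal sketch of such a test, assuming epoched data as a NumPy array of shape (n_trials, n_channels, n_times) and using mean window amplitude as an illustrative per-trial feature; the paper's exact features and correction procedure are not reproduced here.

```python
# Hedged sketch: per-channel one-way ANOVA for tone-dependent activation.
# The scalar feature (mean amplitude per trial) is an illustrative assumption.
import numpy as np
from scipy.stats import f_oneway

def tone_anova(epochs, labels, channel):
    """F-test of a per-trial scalar feature across tone groups for one channel."""
    feat = epochs[:, channel, :].mean(axis=-1)          # mean amplitude per trial
    labels = np.asarray(labels)
    groups = [feat[labels == t] for t in np.unique(labels)]
    f_stat, p_value = f_oneway(*groups)                 # one-way ANOVA across tones
    return f_stat, p_value
```

Pairwise contrasts of the Tone1-vs-Tone2 kind could then be run on the same feature (e.g., with scipy.stats.ttest_ind), subject to an appropriate multiple-comparison correction.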
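ERSP measures baseline-normalized spectral power change over time. The following is a hedged sketch of one common way to compute it, using MNE-Python's Morlet-wavelet transform; the sampling rate, frequency grid, and baseline window in the example call are illustrative assumptions, not values from the paper.

```python
# Minimal ERSP sketch: Morlet-wavelet power, averaged over trials, expressed
# in dB relative to a baseline window. Assumes epochs start at the baseline.
import numpy as np
from mne.time_frequency import tfr_array_morlet

def compute_ersp(epochs, sfreq, freqs, baseline=(0.0, 0.2)):
    """Return baseline-normalized power in dB, shape (n_channels, n_freqs, n_times)."""
    # Per-epoch wavelet power: (n_epochs, n_channels, n_freqs, n_times)
    power = tfr_array_morlet(epochs, sfreq=sfreq, freqs=freqs,
                             n_cycles=freqs / 2.0, output="power")
    avg = power.mean(axis=0)                            # average over trials
    t0, t1 = (int(b * sfreq) for b in baseline)         # baseline sample indices
    base = avg[..., t0:t1].mean(axis=-1, keepdims=True)
    return 10.0 * np.log10(avg / base)                  # ERSP in dB re. baseline

# Illustrative call: 40 trials, 64 channels, 2 s at 1 kHz, 4-40 Hz.
# ersp = compute_ersp(np.random.randn(40, 64, 2000), sfreq=1000.0,
#                     freqs=np.arange(4.0, 41.0, 2.0))
```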
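For the tone classifiers, shrinkage LDA is a standard regularized-LDA implementation and serves here only as a stand-in for the paper's RLDA; the flattened temporal-window features are likewise a simplification of the reported EEG temporal features. A minimal scikit-learn sketch:

```python
# Hedged sketch: four-class tone classification with shrinkage LDA (a common
# regularized-LDA variant) and an RBF SVM, scored by cross-validation.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def tone_accuracy(epochs, labels, cv=5):
    """Return (rlda_acc, svm_acc) for trials of shape (n_trials, n_channels, n_times)."""
    X = epochs.reshape(len(epochs), -1)   # flatten channels x time into features
    y = np.asarray(labels)                # tone labels, e.g. 1-4 for Mandarin tones
    rlda = make_pipeline(StandardScaler(),
                         LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto"))
    svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    return (cross_val_score(rlda, X, y, cv=cv).mean(),
            cross_val_score(svm, X, y, cv=cv).mean())
```

Feature-level fusion of EEG and EMG, as in the speech-mode decoding result, can be sketched the same way by concatenating the two flattened feature blocks (e.g., with np.hstack) before fitting the classifier.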
Journal Introduction:
The journal covers rehabilitative and neural aspects of biomedical engineering, including functional electrical stimulation, acoustic dynamics, human performance measurement and analysis, nerve stimulation, electromyography, motor control and stimulation, and hardware and software applications for rehabilitation engineering and assistive devices.