Decoding speech sounds from neurophysiological data: Practical considerations and theoretical implications.

Psychophysiology. Pub Date: 2024-04-01. Epub Date: 2023-11-10. DOI: 10.1111/psyp.14475
McCall E Sarrett, Joseph C Toscano

Abstract

Machine learning techniques have proven to be a useful tool in cognitive neuroscience. However, their implementation in scalp-recorded electroencephalography (EEG) is relatively limited. To address this, we present three analyses using data from a previous study that examined event-related potential (ERP) responses to a wide range of naturally-produced speech sounds. First, we explore which features of the EEG signal best maximize machine learning accuracy for a voicing distinction, using a support vector machine (SVM). We manipulate three dimensions of the EEG signal as input to the SVM: number of trials averaged, number of time points averaged, and polynomial fit. We discuss the trade-offs in using different feature sets and offer some recommendations for researchers using machine learning. Next, we use SVMs to classify specific pairs of phonemes, finding that we can detect differences in the EEG signal that are not otherwise detectable using conventional ERP analyses. Finally, we characterize the timecourse of phonetic feature decoding across three phonological dimensions (voicing, manner of articulation, and place of articulation), and find that voicing and manner are decodable from neural activity, whereas place of articulation is not. This set of analyses addresses both practical considerations in the application of machine learning to EEG, particularly for speech studies, and also sheds light on current issues regarding the nature of perceptual representations of speech.
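The abstract describes three dimensions of the EEG signal that were manipulated to build SVM input features: the number of trials averaged, the number of time points averaged, and a polynomial fit. The sketch below illustrates that kind of feature construction on simulated single-trial data; the array shapes, parameter names, and function are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated single-channel EEG: 40 trials x 200 time points
trials = rng.normal(size=(40, 200))

def build_features(trials, n_avg=5, n_downsample=4, poly_order=3):
    """Reduce single-trial EEG to classifier features along the three
    dimensions named in the abstract:
      - n_avg: number of trials averaged into each pseudo-trial
      - n_downsample: number of adjacent time points averaged together
      - poly_order: order of the polynomial fit to each averaged waveform
    Returns one coefficient vector per pseudo-trial."""
    n_trials, n_time = trials.shape
    n_groups = n_trials // n_avg
    # Average groups of trials into pseudo-trials (raises SNR, shrinks n)
    pseudo = trials[: n_groups * n_avg].reshape(n_groups, n_avg, n_time).mean(axis=1)
    # Average adjacent time points (temporal downsampling)
    n_bins = n_time // n_downsample
    binned = pseudo[:, : n_bins * n_downsample].reshape(
        n_groups, n_bins, n_downsample).mean(axis=2)
    # Fit a polynomial to each binned waveform; coefficients are the features
    t = np.arange(n_bins)
    return np.stack([np.polyfit(t, w, deg=poly_order) for w in binned])

features = build_features(trials)
print(features.shape)  # (8, 4): 8 pseudo-trials, poly_order + 1 coefficients
```

Each choice trades statistical power against resolution: more aggressive averaging yields cleaner features but fewer training examples for the SVM, which is the trade-off the first analysis explores.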
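The final analysis characterizes the timecourse of phonetic feature decoding, i.e., classification accuracy computed at each point in the epoch. A minimal sketch of that logic, using simulated data and a nearest-class-mean classifier as a stand-in for the paper's SVM (the onset time, effect size, and fold scheme here are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two phoneme classes x 30 trials x 100 time points; class b differs
# from class a only after time point 40 (a hypothetical decodable window)
n_trials, n_time = 30, 100
a = rng.normal(size=(n_trials, n_time))
b = rng.normal(size=(n_trials, n_time))
b[:, 40:] += 1.5

def decode_timecourse(a, b, n_folds=5):
    """Cross-validated decoding accuracy at each time point, using a
    nearest-class-mean classifier trained on the held-in folds."""
    n = a.shape[0]
    fold = np.arange(n) % n_folds
    acc = np.zeros(a.shape[1])
    for t in range(a.shape[1]):
        correct = 0
        for f in range(n_folds):
            tr, te = fold != f, fold == f
            ma, mb = a[tr, t].mean(), b[tr, t].mean()
            # classify each held-out trial by the nearer class mean
            correct += np.sum(np.abs(a[te, t] - ma) < np.abs(a[te, t] - mb))
            correct += np.sum(np.abs(b[te, t] - mb) < np.abs(b[te, t] - ma))
        acc[t] = correct / (2 * n)
    return acc

acc = decode_timecourse(a, b)
print(acc[:40].mean(), acc[40:].mean())
```

Accuracy hovers near chance before the simulated difference emerges and rises afterward; plotting such a curve per phonological dimension is what lets the authors conclude that voicing and manner are decodable while place of articulation is not.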
