Never mind the repeat: How speech expectations reduce tracking at the cocktail party

IF 3.3, Zone 2 (Psychology), JCR Q1 (Behavioral Sciences)

Cortex Pub Date : 2025-05-20 DOI:10.1016/j.cortex.2025.05.003

Thaiz Sánchez-Costa, Alejandra Carboni, Francisco Cervantes Constantino

Cortex, Volume 189, Pages 1-19. https://www.sciencedirect.com/science/article/pii/S0010945225001248

Citations: 0

Abstract

When the brain focuses on a conversation in a noisy environment, it exploits past experience to prioritize relevant elements from the auditory scene. This prompts the question of what changes occur in the selective neural processing of speech mixtures as listeners garner prior experience about single speech objects. In three different priming experiments, we quantified cortical selection of temporal landmarks from continuous speech, applying the temporal response function (TRF) method to single-trial electroencephalography (EEG) recordings. The designs specifically addressed how attention interacts with exact (Experiment 1), voice (Experiment 2a), or message (Experiment 2b) content priming of the target or background speakers in cortical responses to speech. Our results demonstrate that, during multispeaker listening, attentional gains typical of cortical responses under speech selection are met with attenuations as a consequence of prior experience. The changes were observed at the P2 processing stage (220–320 msec) of speech envelope onset processing and were specific to responses to primed speech targets (Experiment 1). Suppressions at stages earlier than the P2, or under partial priming conditions (Experiments 2a and 2b), were not observed. An exploratory analysis suggests the observed P2 reduction predicts listeners' ability to report target words, consistent with this component encoding in part temporal prediction error about onset edge cues exclusive to target speech. Our results show that at this late and definitive stage of selective attention, the auditory system may test the evidence for its own predictive model of the noise-invariant speech stream. Precise inference of its temporal structure is bound to tag all checkpoints where auditory evidence can be most reliably connected into higher-order representations of continuous speech.
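The temporal response function mentioned in the abstract is commonly estimated as a regularized linear mapping from time-lagged stimulus features to the EEG. The following is a minimal sketch of that general idea on synthetic data, not the authors' pipeline: the sampling rate, lag window, ridge parameter, simulated envelope, and the simulated response bump near 270 ms are all assumptions chosen for illustration, and the helper names `lagged_design` and `estimate_trf` are hypothetical.

```python
import numpy as np

def lagged_design(stimulus, n_lags):
    """Build a design matrix whose column k is the stimulus delayed by k samples."""
    n = len(stimulus)
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = stimulus[:n - k]
    return X

def estimate_trf(stimulus, eeg, n_lags, ridge=1.0):
    """Ridge-regression TRF estimate: solve (X'X + ridge*I) w = X'y."""
    X = lagged_design(stimulus, n_lags)
    XtX = X.T @ X + ridge * np.eye(n_lags)
    return np.linalg.solve(XtX, X.T @ eeg)

# --- Synthetic demonstration (all values below are assumptions) ---
rng = np.random.default_rng(0)
fs = 100                      # hypothetical sampling rate in Hz
n_samples = 6000              # 60 s of simulated data
n_lags = 50                   # lags from 0 to 490 ms
lags_ms = np.arange(n_lags) * 1000 / fs

# Stand-in for a speech envelope, reduced to its onsets
# (half-wave rectified first difference).
envelope = rng.random(n_samples)
onsets = np.maximum(np.diff(envelope, prepend=envelope[0]), 0.0)

# Simulated "true" response with a single bump in a P2-like window.
true_trf = np.exp(-0.5 * ((lags_ms - 270) / 30) ** 2)
eeg = lagged_design(onsets, n_lags) @ true_trf
eeg = eeg + 0.1 * rng.standard_normal(n_samples)

trf = estimate_trf(onsets, eeg, n_lags, ridge=1.0)
peak_ms = lags_ms[np.argmax(trf)]
print(f"Estimated TRF peaks at {peak_ms:.0f} ms")
```

In a real analysis the ridge parameter would typically be chosen by cross-validation, and attended versus ignored (or primed versus unprimed) speech would each get their own estimated response so that amplitudes in a window such as 220–320 msec can be compared across conditions.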


Source journal: Cortex (Medicine: Behavioral Sciences)

CiteScore: 7.00

Self-citation rate: 5.60%

Articles published: 250

Review time: 74 days

About the journal: CORTEX is an international journal devoted to the study of cognition and of the relationship between the nervous system and mental processes, particularly as these are reflected in the behaviour of patients with acquired brain lesions, normal volunteers, children with typical and atypical development, and in the activation of brain regions and systems as recorded by functional neuroimaging techniques. It was founded in 1964 by Ennio De Renzi.