Joint population coding and temporal coherence link an attended talker's voice and location features in naturalistic multi-talker scenes.

Kiki van der Heijden, Prachi Patel, Stephan Bickel, Jose L Herrero, Ashesh D Mehta, Nima Mesgarani
{"title":"Joint population coding and temporal coherence link an attended talker's voice and location features in naturalistic multi-talker scenes.","authors":"Kiki van der Heijden, Prachi Patel, Stephan Bickel, Jose L Herrero, Ashesh D Mehta, Nima Mesgarani","doi":"10.1523/JNEUROSCI.0754-25.2025","DOIUrl":null,"url":null,"abstract":"<p><p>Listeners effortlessly extract multidimensional auditory objects, such as a localized talker, from complex acoustic scenes. However, the neural mechanisms that enable simultaneous encoding and linking of distinct sound features-such as a talker's voice and location-are not fully understood. Using invasive intracranial recordings in seven neurosurgical patients (4 male, 3 female), we investigated how the human auditory cortex processes and integrates these features during naturalistic multi-talker scenes and how attentional mechanisms modulate such feature integration. We found that cortical sites exhibit a continuum of feature sensitivity, ranging from single-feature sensitive sites (responsive primarily to voice spectral features or to location features) to dual-feature sensitive sites (responsive to both features). At the population level, neural response patterns from both single- and dual-feature sensitive sites jointly encoded the attended talker's voice and location. Notably, single-feature sensitive sites encoded their primary feature with greater precision but also represented coarse information about the secondary feature. Sites selectively tracking a single, attended speech stream concurrently encoded both voice and location features, demonstrating a link between selective attention and feature integration. Additionally, attention selectively enhanced temporal coherence between voice- and location-sensitive sites, suggesting that temporal synchronization serves as a mechanism for linking these features. Our findings highlight two complementary neural mechanisms-joint population coding and temporal coherence-that enable the integration of voice and location features in the auditory cortex. These results provide new insights into the distributed, multidimensional nature of auditory object formation during active listening in complex environments.<b>Significance statement</b> In everyday life, listeners effortlessly extract individual sound sources from complex acoustic scenes which contain multiple sound sources. Yet, how the brain links the different features of a particular sound source to each other - such as a talker's voice characteristics and location - is poorly understood. 
Here, we show that two neural mechanisms contribute to encoding and integrating voice and location features in multi-talker sound scenes: (1) some neuronal sites are sensitive to both voice and location and their activity patterns encode these features jointly; (2) the responses of neuronal sites that process only one sound feature - that is, location or voice - align temporally to form a stream that is segregated from the other talker.</p>","PeriodicalId":50114,"journal":{"name":"Journal of Neuroscience","volume":" ","pages":""},"PeriodicalIF":4.0000,"publicationDate":"2025-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Neuroscience","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1523/JNEUROSCI.0754-25.2025","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"NEUROSCIENCES","Score":null,"Total":0}
引用次数: 0

Abstract

Listeners effortlessly extract multidimensional auditory objects, such as a localized talker, from complex acoustic scenes. However, the neural mechanisms that enable simultaneous encoding and linking of distinct sound features, such as a talker's voice and location, are not fully understood. Using invasive intracranial recordings in seven neurosurgical patients (4 male, 3 female), we investigated how the human auditory cortex processes and integrates these features during naturalistic multi-talker scenes, and how attentional mechanisms modulate such feature integration. We found that cortical sites exhibit a continuum of feature sensitivity, ranging from single-feature sensitive sites (responsive primarily to voice spectral features or to location features) to dual-feature sensitive sites (responsive to both features). At the population level, neural response patterns from both single- and dual-feature sensitive sites jointly encoded the attended talker's voice and location. Notably, single-feature sensitive sites encoded their primary feature with greater precision but also represented coarse information about the secondary feature. Sites selectively tracking a single, attended speech stream concurrently encoded both voice and location features, demonstrating a link between selective attention and feature integration. Additionally, attention selectively enhanced temporal coherence between voice- and location-sensitive sites, suggesting that temporal synchronization serves as a mechanism for linking these features. Our findings highlight two complementary neural mechanisms, joint population coding and temporal coherence, that enable the integration of voice and location features in the auditory cortex. These results provide new insights into the distributed, multidimensional nature of auditory object formation during active listening in complex environments.

Significance statement

In everyday life, listeners effortlessly extract individual sound sources from complex acoustic scenes that contain multiple sound sources. Yet, how the brain links the different features of a particular sound source to each other, such as a talker's voice characteristics and location, is poorly understood. Here, we show that two neural mechanisms contribute to encoding and integrating voice and location features in multi-talker sound scenes: (1) some neuronal sites are sensitive to both voice and location, and their activity patterns encode these features jointly; (2) the responses of neuronal sites that process only one sound feature, that is, location or voice, align temporally to form a stream that is segregated from the other talker.
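The joint population coding result lends itself to a toy illustration. The following is a minimal, hypothetical Python sketch (NumPy and scikit-learn, with simulated responses rather than the paper's intracranial data, and made-up tuning weights): a population mixing voice-sensitive, location-sensitive, and dual-feature sensitive sites supports linear decoding of the joint (voice, location) identity of the attended talker, even though many individual sites carry only one feature precisely.

```python
# Illustrative sketch only: simulated site responses, not the authors' pipeline.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_sites = 400, 40

# Hypothetical trial labels: voice identity in {0, 1}, location in {0, 1}.
voice = rng.integers(0, 2, n_trials)
location = rng.integers(0, 2, n_trials)
joint_label = voice * 2 + location          # four joint (voice, location) classes

# Assumed tuning: 15 voice-sensitive sites, 15 location-sensitive sites,
# and 10 dual-feature sensitive sites responsive to both features.
w_voice = np.concatenate([rng.normal(1.0, 0.2, 15), np.zeros(15), rng.normal(0.6, 0.2, 10)])
w_loc = np.concatenate([np.zeros(15), rng.normal(1.0, 0.2, 15), rng.normal(0.6, 0.2, 10)])

# Trial-by-site response matrix with additive noise.
X = (np.outer(voice, w_voice) + np.outer(location, w_loc)
     + rng.normal(0, 0.5, (n_trials, n_sites)))

# Linear population decoder for the joint class; chance level is 0.25.
clf = LogisticRegression(max_iter=1000)
acc = cross_val_score(clf, X, joint_label, cv=5).mean()
print(f"joint voice+location decoding accuracy: {acc:.2f} (chance = 0.25)")
```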
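The temporal-coherence mechanism can be sketched the same way. This illustrative snippet (simulated speech-envelope-like signals and scipy.signal.coherence, not the authors' analysis) shows the basic logic: a voice-sensitive site and a location-sensitive site that both track the attended stream are more coherent at low (envelope-rate) frequencies than two sites tracking different streams.

```python
# Illustrative sketch only: simulated envelopes, not intracranial recordings.
import numpy as np
from scipy.signal import coherence

rng = np.random.default_rng(1)
fs, dur = 100, 60                          # 100 Hz envelope rate, 60 s
t = np.arange(int(fs * dur)) / fs

def speech_envelope():
    """Band-limited (~2-8 Hz) noise as a stand-in for a speech envelope."""
    x = rng.normal(size=t.size)
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(t.size, 1 / fs)
    spec[(freqs < 2) | (freqs > 8)] = 0
    return np.fft.irfft(spec, n=t.size)

attended, ignored = speech_envelope(), speech_envelope()

# A voice-sensitive and a location-sensitive site, both locked to the
# attended stream (independent noise at each site) ...
site_voice = attended + 0.5 * rng.normal(size=t.size)
site_loc_attended = attended + 0.5 * rng.normal(size=t.size)
# ... versus a location-sensitive site locked to the ignored stream.
site_loc_ignored = ignored + 0.5 * rng.normal(size=t.size)

f, c_same = coherence(site_voice, site_loc_attended, fs=fs, nperseg=512)
f, c_diff = coherence(site_voice, site_loc_ignored, fs=fs, nperseg=512)
band = (f >= 2) & (f <= 8)
print(f"2-8 Hz coherence, same stream: {c_same[band].mean():.2f}, "
      f"different streams: {c_diff[band].mean():.2f}")
```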

Source journal

Journal of Neuroscience (Medicine, Neuroscience)
CiteScore: 9.30
Self-citation rate: 3.80%
Articles per year: 1164
Review time: 12 months
About the journal: JNeurosci (ISSN 0270-6474) is an official journal of the Society for Neuroscience. It is published weekly by the Society, fifty weeks a year, in one annual volume. JNeurosci publishes papers on a broad range of topics of general interest to those working on the nervous system. Authors have an Open Choice option for their published articles.