Natural sounds can be reconstructed from human neuroimaging data using deep neural network representation.

IF 7.2; Zone 1 (Biology); Q1 (Agricultural and Biological Sciences)
PLoS Biology; Pub Date: 2025-07-23; eCollection Date: 2025-07-01; DOI: 10.1371/journal.pbio.3003293
Jong-Yun Park, Mitsuaki Tsukamoto, Misato Tanaka, Yukiyasu Kamitani
PLoS Biology 23(7): e3003293. Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12313072/pdf/
Citations: 0

Abstract

Reconstruction of perceptual experiences from brain activity offers a unique window into how population neural responses represent sensory information. Although decoding visual content from functional MRI (fMRI) has seen significant success, reconstructing arbitrary sounds remains challenging due to the fine temporal structure of auditory signals and the coarse temporal resolution of fMRI. Drawing on the hierarchical auditory features of deep neural networks (DNNs) with progressively larger time windows and their neural activity correspondence, we introduce a method for sound reconstruction that integrates brain decoding of DNN features and an audio-generative model. DNN features decoded from auditory cortical activity outperformed spectrotemporal and modulation-based features, enabling perceptually plausible reconstructions across diverse sound categories. Behavioral evaluations and objective measures confirmed that these reconstructions preserved short-term spectral and perceptual properties, capturing the characteristic timbre of speech, animal calls, and musical instruments, while the reconstructed sounds did not reproduce longer temporal sequences with fidelity. Leave-category-out analyses indicated that the method generalizes across sound categories. Reconstructions at higher DNN layers and from early auditory regions revealed distinct contributions to decoding performance. Applying the model to a selective auditory attention ("cocktail party") task further showed that reconstructions reflected the attended sound more strongly than the unattended one in some of the subjects. Despite its inability to reconstruct exact temporal sequences, which may reflect the limited temporal resolution of fMRI, our framework demonstrates the feasibility of mapping brain activity to auditory experiences, a step toward a more comprehensive understanding and reconstruction of internal auditory representations.
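
The "brain decoding of DNN features" step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which regression model the authors used, so the ridge regression below is an assumption, and all data are synthetic: the array sizes, the noise level, and the helper names `fit_ridge`, `profile_correlation`, and `simulate_and_decode` are hypothetical. The audio-generative model that would turn decoded features back into sound is not shown.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y.

    X: (n_samples, n_voxels) brain activity patterns.
    Y: (n_samples, n_units) DNN feature values for the same stimuli.
    """
    n_voxels = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_voxels), X.T @ Y)


def profile_correlation(Y_true, Y_pred):
    """Pearson correlation between true and decoded values, per feature unit."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom


def simulate_and_decode(seed=0):
    """Train a decoder on synthetic data and score it on held-out samples."""
    rng = np.random.default_rng(seed)
    n_train, n_test, n_voxels, n_units = 200, 50, 100, 20

    # Assumed toy encoding model: voxel responses are a noisy linear
    # mixture of the DNN features (not taken from the paper).
    B = rng.standard_normal((n_units, n_voxels))
    F_train = rng.standard_normal((n_train, n_units))
    F_test = rng.standard_normal((n_test, n_units))
    X_train = F_train @ B + 0.5 * rng.standard_normal((n_train, n_voxels))
    X_test = F_test @ B + 0.5 * rng.standard_normal((n_test, n_voxels))

    # Decoder: map voxel patterns back to feature values.
    W = fit_ridge(X_train, F_train, alpha=10.0)
    F_pred = X_test @ W
    return profile_correlation(F_test, F_pred)


if __name__ == "__main__":
    corr = simulate_and_decode()
    print(f"mean per-unit correlation on held-out samples: {corr.mean():.2f}")
```

In a pipeline of the kind the abstract describes, the decoded feature matrix would then be passed to an audio-generative model. Under the same assumptions, a leave-category-out analysis would amount to choosing the training and test samples so that no sound category appears in both.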

Source journal
PLoS Biology (BIOCHEMISTRY & MOLECULAR BIOLOGY; BIOLOGY)
CiteScore: 15.40
Self-citation rate: 2.00%
Articles published: 359
Review time: 3-8 weeks
Journal description: PLOS Biology is the flagship journal of the Public Library of Science (PLOS) and focuses on publishing groundbreaking and relevant research in all areas of biological science. The journal features works at various scales, ranging from molecules to ecosystems, and also encourages interdisciplinary studies. PLOS Biology publishes articles that demonstrate exceptional significance, originality, and relevance, with a high standard of scientific rigor in methodology, reporting, and conclusions. The journal aims to advance science and serve the research community by transforming research communication to align with the research process. It offers evolving article types and policies that empower authors to share the complete story behind their scientific findings with a diverse global audience of researchers, educators, policymakers, patient advocacy groups, and the general public. PLOS Biology, along with other PLOS journals, is widely indexed by major services such as Crossref, Dimensions, DOAJ, Google Scholar, PubMed, PubMed Central, Scopus, and Web of Science. Additionally, PLOS Biology is indexed by various other services including AGRICOLA, Biological Abstracts, BIOSIS Previews, CABI CAB Abstracts, CABI Global Health, CAPES, CAS, CNKI, Embase, Journal Guide, MEDLINE, and Zoological Record, ensuring that the research content is easily accessible and discoverable by a wide range of audiences.