Point-light Talkers: Multisensory Enhancement of Speech Tracking by Co-speech Movement Kinematics.

Impact Factor 3.0 · JCR Q2 (Neurosciences) · CAS Tier 3 (Medicine)
Jacob P Momsen, Seana Coulson
{"title":"Point-light Talkers: Multisensory Enhancement of Speech Tracking by Co-speech Movement Kinematics.","authors":"Jacob P Momsen, Seana Coulson","doi":"10.1162/jocn.a.62","DOIUrl":null,"url":null,"abstract":"<p><p>While multisensory super-additivity has been demonstrated in the context of visual articulation, it is unclear whether speech and co-speech gestures are similarly subject to super-additive integration. The current study investigates multisensory integration of speech and bodily gestures, testing whether biological motion signatures of co-speech gestures enhance cortical tracking of the speech envelope. We recorded EEG from 20 healthy adults as they watched a series of multimodal discourse clips from four conditions: AV congruent clips with co-speech gestures that were naturally aligned with speech, AV incongruent clips in which gestures were not aligned with the speech, audio-only clips in which speech was delivered in isolation, and video-only clips presenting the gesture content with no accompanying speech. As we hypothesize that the kinematics of co-speech gestures are sufficient to drive gestural enhancement of speech, our clips employed minimalistic \"point-light\" depictions of a speaker's movements: point-light talkers. Using neural decoder models to predict the amplitude of the speech envelope from EEG elicited in all four conditions, we compared speech reconstruction performance between multisensory (AV congruent) and additive models, that is, those representing the summed neural response across the two unisensory conditions. We found significant improvement in decoder scores for models trained on AV congruent trials relative to both audio-only and additive models. Forward models of brain activity indicated signatures of multisensory integration 140-160 msec following changes to the speech envelope. These results provide novel evidence for a multisensory enhancement effect of co-speech gesture kinematics on continuous speech tracking.</p>","PeriodicalId":51081,"journal":{"name":"Journal of Cognitive Neuroscience","volume":" ","pages":"1-16"},"PeriodicalIF":3.0000,"publicationDate":"2025-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Cognitive Neuroscience","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1162/jocn.a.62","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"NEUROSCIENCES","Score":null,"Total":0}
Citations: 0

Abstract

While multisensory super-additivity has been demonstrated in the context of visual articulation, it is unclear whether speech and co-speech gestures are similarly subject to super-additive integration. The current study investigates multisensory integration of speech and bodily gestures, testing whether biological motion signatures of co-speech gestures enhance cortical tracking of the speech envelope. We recorded EEG from 20 healthy adults as they watched a series of multimodal discourse clips from four conditions: AV congruent clips with co-speech gestures that were naturally aligned with speech, AV incongruent clips in which gestures were not aligned with the speech, audio-only clips in which speech was delivered in isolation, and video-only clips presenting the gesture content with no accompanying speech. As we hypothesize that the kinematics of co-speech gestures are sufficient to drive gestural enhancement of speech, our clips employed minimalistic "point-light" depictions of a speaker's movements: point-light talkers. Using neural decoder models to predict the amplitude of the speech envelope from EEG elicited in all four conditions, we compared speech reconstruction performance between multisensory (AV congruent) and additive models, that is, those representing the summed neural response across the two unisensory conditions. We found significant improvement in decoder scores for models trained on AV congruent trials relative to both audio-only and additive models. Forward models of brain activity indicated signatures of multisensory integration 140-160 msec following changes to the speech envelope. These results provide novel evidence for a multisensory enhancement effect of co-speech gesture kinematics on continuous speech tracking.
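
To make the analysis concrete, below is a minimal sketch of how a backward ("decoder") model of this kind can be set up: a ridge regression over time-lagged EEG channels that reconstructs the speech envelope, scored by the correlation between the reconstruction and the actual envelope. This follows common mTRF-style practice rather than the authors' exact pipeline; the function names, lag window, regularization strength, and the trailing usage example (including the additive comparison built by summing the unisensory EEG) are illustrative assumptions.

```python
# Illustrative sketch of speech-envelope reconstruction ("backward" decoding)
# from EEG with lagged ridge regression. Lag window, alpha, and all variable
# names are assumptions for demonstration, not the study's actual parameters.

import numpy as np
from sklearn.linear_model import Ridge
from scipy.stats import pearsonr


def lag_matrix(eeg, min_lag, max_lag):
    """Stack time-lagged copies of each EEG channel.

    eeg: array of shape (n_samples, n_channels); lags are in samples.
    """
    n_samples, n_channels = eeg.shape
    lags = range(min_lag, max_lag + 1)
    X = np.zeros((n_samples, n_channels * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.roll(eeg, -lag, axis=0)  # EEG at time t + lag
        if lag > 0:
            shifted[-lag:] = 0                # zero out wrapped-around samples
        elif lag < 0:
            shifted[:-lag] = 0
        X[:, i * n_channels:(i + 1) * n_channels] = shifted
    return X


def decode_envelope(eeg_train, env_train, eeg_test, env_test,
                    min_lag=0, max_lag=25, alpha=1e3):
    """Fit a ridge decoder on training data and score it on held-out data.

    Returns the Pearson correlation between the reconstructed and actual
    speech envelope (the "decoder score").
    """
    model = Ridge(alpha=alpha)
    model.fit(lag_matrix(eeg_train, min_lag, max_lag), env_train)
    reconstruction = model.predict(lag_matrix(eeg_test, min_lag, max_lag))
    r, _ = pearsonr(reconstruction, env_test)
    return r


# Hypothetical usage: compare a multisensory (AV congruent) decoder against an
# "additive" decoder trained on the sample-wise sum of the audio-only and
# video-only EEG responses. Super-additive integration would show up as
# r_av reliably exceeding r_additive.
# r_av = decode_envelope(eeg_av_train, env_train, eeg_av_test, env_test)
# r_additive = decode_envelope(eeg_audio_train + eeg_video_train, env_train,
#                              eeg_audio_test + eeg_video_test, env_test)
```
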

Source journal
Journal of Cognitive Neuroscience (Medicine - Neurosciences)
CiteScore: 5.30
Self-citation rate: 3.10%
Articles per year: 151
Review time: 3-8 weeks
About the journal: Journal of Cognitive Neuroscience investigates brain–behavior interaction and promotes lively interchange among the mind sciences.