Audiovisual Speech Perception and the McGurk Effect

L. Rosenblum
{"title":"Audiovisual Speech Perception and the McGurk Effect","authors":"L. Rosenblum","doi":"10.1093/acrefore/9780199384655.013.420","DOIUrl":null,"url":null,"abstract":"Research on visual and audiovisual speech information has profoundly influenced the fields of psycholinguistics, perception psychology, and cognitive neuroscience. Visual speech findings have provided some of most the important human demonstrations of our new conception of the perceptual brain as being supremely multimodal. This “multisensory revolution” has seen a tremendous growth in research on how the senses integrate, cross-facilitate, and share their experience with one another.\n The ubiquity and apparent automaticity of multisensory speech has led many theorists to propose that the speech brain is agnostic with regard to sense modality: it might not know or care from which modality speech information comes. Instead, the speech function may act to extract supramodal informational patterns that are common in form across energy streams. Alternatively, other theorists have argued that any common information existent across the modalities is minimal and rudimentary, so that multisensory perception largely depends on the observer’s associative experience between the streams. From this perspective, the auditory stream is typically considered primary for the speech brain, with visual speech simply appended to its processing. If the utility of multisensory speech is a consequence of a supramodal informational coherence, then cross-sensory “integration” may be primarily a consequence of the informational input itself. If true, then one would expect to see evidence for integration occurring early in the perceptual process, as well in a largely complete and automatic/impenetrable manner. Alternatively, if multisensory speech perception is based on associative experience between the modal streams, then no constraints on how completely or automatically the senses integrate are dictated. There is behavioral and neurophysiological research supporting both perspectives.\n Much of this research is based on testing the well-known McGurk effect, in which audiovisual speech information is thought to integrate to the extent that visual information can affect what listeners report hearing. However, there is now good reason to believe that the McGurk effect is not a valid test of multisensory integration. For example, there are clear cases in which responses indicate that the effect fails, while other measures suggest that integration is actually occurring. By mistakenly conflating the McGurk effect with speech integration itself, interpretations of the completeness and automaticity of multisensory may be incorrect. Future research should use more sensitive behavioral and neurophysiological measures of cross-modal influence to examine these issues.","PeriodicalId":331003,"journal":{"name":"Oxford Research Encyclopedia of Linguistics","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Oxford Research Encyclopedia of Linguistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/acrefore/9780199384655.013.420","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15

Abstract

Research on visual and audiovisual speech information has profoundly influenced the fields of psycholinguistics, perception psychology, and cognitive neuroscience. Visual speech findings have provided some of the most important human demonstrations of our new conception of the perceptual brain as being supremely multimodal. This “multisensory revolution” has seen a tremendous growth in research on how the senses integrate, cross-facilitate, and share their experience with one another.

The ubiquity and apparent automaticity of multisensory speech has led many theorists to propose that the speech brain is agnostic with regard to sense modality: it might not know or care from which modality speech information comes. Instead, the speech function may act to extract supramodal informational patterns that are common in form across energy streams. Alternatively, other theorists have argued that any common information existing across the modalities is minimal and rudimentary, so that multisensory perception largely depends on the observer’s associative experience between the streams. From this perspective, the auditory stream is typically considered primary for the speech brain, with visual speech simply appended to its processing. If the utility of multisensory speech is a consequence of supramodal informational coherence, then cross-sensory “integration” may be primarily a consequence of the informational input itself. If true, then one would expect to see evidence for integration occurring early in the perceptual process, as well as in a largely complete and automatic/impenetrable manner. Alternatively, if multisensory speech perception is based on associative experience between the modal streams, then no constraints are dictated on how completely or automatically the senses integrate. There is behavioral and neurophysiological research supporting both perspectives.

Much of this research is based on testing the well-known McGurk effect, in which audiovisual speech information is thought to integrate to the extent that visual information can affect what listeners report hearing (classically, an auditory /ba/ paired with a visual /ga/ is often reported as /da/). However, there is now good reason to believe that the McGurk effect is not a valid test of multisensory integration. For example, there are clear cases in which responses indicate that the effect fails, while other measures suggest that integration is actually occurring. By mistakenly conflating the McGurk effect with speech integration itself, interpretations of the completeness and automaticity of multisensory speech perception may be incorrect. Future research should use more sensitive behavioral and neurophysiological measures of cross-modal influence to examine these issues.