A Computational Framework to Study Hierarchical Processing in Visual Narratives

IF 2.4; Zone 2 Psychology; Q2 PSYCHOLOGY, EXPERIMENTAL

Cognitive Science Pub Date : 2025-05-02 DOI:10.1111/cogs.70050

Aditya Upadhyayula, Neil Cohn

Cognitive Science, vol. 49, no. 5. Full text: https://onlinelibrary.wiley.com/doi/10.1111/cogs.70050

Citations: 0

Abstract

Theories of visual narrative comprehension have advocated for a hierarchical, grammar-based comprehension mechanism, but only limited work has investigated this hierarchy. Here, we provide a computational framework inspired by computational psycholinguistics to address hierarchy in visual narratives. The predictions generated by this framework were compared against behavioral data to draw inferences about the hierarchical properties of visual narratives. A segmentation task, in which participants ranked all possible segmental boundaries, demonstrated that participants' preferences were predicted by visual narrative grammar. Three kinds of models using surprisal theory (an Earley parser, a hidden Markov model (HMM), and an n-gram model) were then used to generate segmentation preferences for the same task. The Earley parser's preferences were based on a hierarchical grammar with recursive properties, while the HMM and the n-gram model used a flattened grammar for visual narrative comprehension. Given the differences in the mechanics of these models, contrasting their predictions against behavioral data could provide crucial insights into the underlying mechanisms of visual narrative comprehension. By investigating grammatical systems outside of language, this research provides new directions to explore the generic makeup of the cognitive structure of mental representations.
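To make the surprisal idea concrete, here is a minimal sketch of the simplest of the three model types, an n-gram (here, bigram) model. It is not the authors' implementation. The category labels (`E`, `I`, `P`, `R`), the toy corpus, the add-one smoothing, and the rule that a higher-surprisal transition is a more preferred segment boundary are all illustrative assumptions. Surprisal is taken in its standard form, the negative log probability of an item given its context.

```python
import math
from collections import Counter

# Hypothetical toy corpus: each sequence is a strip of panel category labels.
# The labels and sequences are illustrative placeholders, not the paper's data.
corpus = [
    ["E", "I", "P", "R"],
    ["E", "I", "P", "R"],
    ["I", "P", "R"],
    ["E", "I", "I", "P"],
]

START = "<s>"  # sequence-initial context symbol


def train_bigram(sequences):
    """Count bigrams and their left contexts, with a start symbol per sequence."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for seq in sequences:
        prev = START
        for label in seq:
            bigrams[(prev, label)] += 1
            contexts[prev] += 1
            vocab.add(label)
            prev = label
    return bigrams, contexts, vocab


def surprisal(prev, label, model, alpha=1.0):
    """Surprisal in bits, -log2 P(label | prev), with add-alpha smoothing."""
    bigrams, contexts, vocab = model
    p = (bigrams[(prev, label)] + alpha) / (contexts[prev] + alpha * len(vocab))
    return -math.log2(p)


def boundary_surprisals(seq, model):
    """Surprisal of each panel given the previous one (one value per boundary)."""
    return [surprisal(seq[i - 1], seq[i], model) for i in range(1, len(seq))]


model = train_bigram(corpus)
strip = ["E", "I", "P", "R", "E", "I", "P"]
scores = boundary_surprisals(strip, model)

# Assumed linking rule: rank boundaries from most to least surprising.
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
for i in ranked:
    print(f"boundary between panel {i + 1} and {i + 2}: {scores[i]:.3f} bits")
```

In this toy setup the transition from `R` to `E` never occurs in the training sequences, so it receives the highest surprisal and is ranked as the most likely boundary. A hierarchical model such as an Earley parser would instead derive its probabilities from a recursive grammar, which this flat sketch does not attempt.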


Source journal

Cognitive Science (PSYCHOLOGY, EXPERIMENTAL)

CiteScore: 4.10

Self-citation rate: 8.00%

Articles published: 139

About the journal: Cognitive Science publishes articles in all areas of cognitive science, covering such topics as knowledge representation, inference, memory processes, learning, problem solving, planning, perception, natural language understanding, connectionism, brain theory, motor control, intentional systems, and other areas of interdisciplinary concern. Highest priority is given to research reports that are specifically written for a multidisciplinary audience. The audience is primarily researchers in cognitive science and its associated fields, including anthropologists, education researchers, psychologists, philosophers, linguists, computer scientists, neuroscientists, and roboticists.