A Computational Framework to Study Hierarchical Processing in Visual Narratives
Authors: Aditya Upadhyayula, Neil Cohn
Journal: Cognitive Science, vol. 49, no. 5 (published 2025-05-02)
DOI: 10.1111/cogs.70050
URL: https://onlinelibrary.wiley.com/doi/10.1111/cogs.70050
Citation count: 0
Abstract
Theories of visual narrative comprehension have advocated for a hierarchical grammar-based comprehension mechanism, but only limited work has investigated this hierarchy. Here, we provide a computational framework inspired by computational psycholinguistics to address hierarchy in visual narratives. The predictions generated by this framework were compared against behavioral data to draw inferences about the hierarchical properties of visual narratives. A segmentation task, in which participants ranked all possible segmental boundaries, demonstrated that participants' preferences were predicted by visual narrative grammar. Three kinds of surprisal-based models, namely an Earley parser, a hidden Markov model (HMM), and an n-gram model, were then used to generate segmentation preferences for the same task. The Earley parser's preferences were based on a hierarchical grammar with recursive properties, while the HMM and the n-gram model used a flattened grammar for visual narrative comprehension. Given the differences in the mechanics of these models, contrasting their predictions against behavioral data could provide crucial insights into the underlying mechanisms of visual narrative comprehension. By investigating grammatical systems outside of language, this research provides new directions to explore the generic makeup of the cognitive structure of mental representations.
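As a rough illustration of the surprisal idea behind the simplest of the three model types, the sketch below estimates bigram probabilities over sequences of panel-category labels and computes the surprisal, -log2 P(current | previous), at each panel. Everything specific here is an assumption made for illustration: the labels `A` to `D`, the toy corpus, the add-alpha smoothing, and the rule that ranks candidate boundaries by surprisal. It is not the authors' implementation, and the abstract does not specify how surprisal was converted into segmentation preferences.

```python
import math
from collections import Counter


def train_bigram(sequences, alpha=1.0):
    """Return an add-alpha smoothed bigram probability function P(cur | prev)."""
    vocab = sorted({label for seq in sequences for label in seq})
    pair_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1

    def prob(prev, cur):
        numerator = pair_counts[(prev, cur)] + alpha
        denominator = context_counts[prev] + alpha * len(vocab)
        return numerator / denominator

    return prob


def surprisal_profile(seq, prob):
    """Surprisal in bits for every panel after the first one."""
    return [-math.log2(prob(prev, cur)) for prev, cur in zip(seq, seq[1:])]


# Hypothetical toy corpus of panel-category sequences (not data from the paper).
corpus = [
    ["A", "B", "C", "D"],
    ["A", "B", "C", "D"],
    ["B", "C", "D"],
    ["A", "B", "C", "D", "A", "B", "C", "D"],
]
prob = train_bigram(corpus)

test_sequence = ["A", "B", "C", "D", "A", "B", "C", "D"]
profile = surprisal_profile(test_sequence, prob)

# Assumed linking rule: a boundary after panel i is preferred when panel i+1
# is more surprising. Boundary i lies between panel i and panel i+1 (1-indexed).
ranking = sorted(range(1, len(profile) + 1), key=lambda i: -profile[i - 1])
print(ranking[0])
```

In this toy setup the top-ranked boundary falls between the fourth and fifth panels, where one `A` to `D` run ends and the next begins. A flat model of this kind only sees adjacent labels, which is the contrast the abstract draws with the hierarchical, recursive grammar used by the Earley parser.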
About the journal:
Cognitive Science publishes articles in all areas of cognitive science, covering such topics as knowledge representation, inference, memory processes, learning, problem solving, planning, perception, natural language understanding, connectionism, brain theory, motor control, intentional systems, and other areas of interdisciplinary concern. Highest priority is given to research reports that are specifically written for a multidisciplinary audience. The audience is primarily researchers in cognitive science and its associated fields, including anthropologists, education researchers, psychologists, philosophers, linguists, computer scientists, neuroscientists, and roboticists.