{"title":"Computational Sentence-Level Metrics of Reading Speed and Its Ramifications for Sentence Comprehension","authors":"Kun Sun, Rong Wang","doi":"10.1111/cogs.70092","DOIUrl":null,"url":null,"abstract":"<p>The majority of research in computational psycholinguistics on sentence processing has focused on word-by-word incremental processing within sentences, rather than holistic sentence-level representations. This study introduces two novel computational approaches for quantifying sentence-level processing: sentence surprisal and sentence relevance. Using multilingual large language models (LLMs), we compute sentence surprisal through three methods, chain rule, next sentence prediction, and negative log-likelihood, and apply a “memory-aware” approach to calculate sentence-level semantic relevance based on convolution operations. The sentence-level metrics developed are tested and compared to validate whether they can predict the reading speed of sentences, and, further, we explore how sentence-level metrics take effects on human processing and comprehending sentences as a whole across languages. The results show that sentence-level metrics are highly capable of predicting sentence reading speed. Our results also indicate that these computational sentence-level metrics are exceptionally effective at predicting and explaining the processing difficulties encountered by readers in processing sentences as a whole across a variety of languages. The proposed sentence-level metrics offer significant interpretability and achieve high accuracy in predicting human sentence reading speed, as they capture unique aspects of comprehension difficulty beyond word-level measures. These metrics serve as valuable computational tools for investigating human sentence processing and advancing our understanding of naturalistic reading. Their strong performance and generalization capabilities highlight their potential to drive progress at the intersection of LLMs and cognitive science.</p>","PeriodicalId":48349,"journal":{"name":"Cognitive Science","volume":"49 7","pages":""},"PeriodicalIF":2.3000,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/cogs.70092","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cognitive Science","FirstCategoryId":"102","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/cogs.70092","RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}
Citations: 0
Abstract
The majority of research in computational psycholinguistics on sentence processing has focused on word-by-word incremental processing within sentences rather than on holistic sentence-level representations. This study introduces two novel computational approaches for quantifying sentence-level processing: sentence surprisal and sentence relevance. Using multilingual large language models (LLMs), we compute sentence surprisal through three methods: the chain rule, next-sentence prediction, and negative log-likelihood. We also apply a "memory-aware" approach based on convolution operations to calculate sentence-level semantic relevance. The resulting sentence-level metrics are tested and compared to validate whether they predict sentence reading speed, and we further explore how they shape the way humans process and comprehend sentences as a whole across languages. The results show that sentence-level metrics are highly capable of predicting sentence reading speed. Our results also indicate that these computational sentence-level metrics are exceptionally effective at predicting and explaining the difficulties readers encounter when processing sentences as a whole across a variety of languages. The proposed sentence-level metrics offer strong interpretability and achieve high accuracy in predicting human sentence reading speed, as they capture aspects of comprehension difficulty not reflected in word-level measures. These metrics serve as valuable computational tools for investigating human sentence processing and advancing our understanding of naturalistic reading. Their strong performance and generalization capabilities highlight their potential to drive progress at the intersection of LLMs and cognitive science.
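To make the chain-rule formulation of sentence surprisal concrete, the sketch below sums token-level negative log-probabilities from an autoregressive language model, so that sentence surprisal equals the sentence's total negative log-likelihood. This is a minimal illustration under assumptions, not the authors' exact pipeline: the model name "gpt2" is a placeholder (the paper uses multilingual LLMs), and the paper's other two methods (next-sentence prediction and the memory-aware, convolution-based relevance measure) are not shown.

```python
# Minimal sketch of chain-rule sentence surprisal, assuming a Hugging Face
# causal LM. The model name is a placeholder; the paper uses multilingual LLMs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()


def sentence_surprisal(sentence: str) -> float:
    """Return -log P(sentence) in nats, summed over predicted tokens (chain rule)."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=input_ids, the returned loss is the mean per-token NLL
        # over the (n - 1) predicted positions.
        mean_nll = model(ids, labels=ids).loss
    return mean_nll.item() * (ids.size(1) - 1)


print(sentence_surprisal("The cat sat on the mat."))
```

Sentence surprisal computed this way can then be entered, alongside word-level predictors, into regression models of per-sentence reading speed from eye-tracking or self-paced reading corpora.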
Journal description:
Cognitive Science publishes articles in all areas of cognitive science, covering such topics as knowledge representation, inference, memory processes, learning, problem solving, planning, perception, natural language understanding, connectionism, brain theory, motor control, intentional systems, and other areas of interdisciplinary concern. Highest priority is given to research reports that are specifically written for a multidisciplinary audience. The audience is primarily researchers in cognitive science and its associated fields, including anthropologists, education researchers, psychologists, philosophers, linguists, computer scientists, neuroscientists, and roboticists.