{"title":"Computational Sentence-Level Metrics of Reading Speed and Its Ramifications for Sentence Comprehension","authors":"Kun Sun, Rong Wang","doi":"10.1111/cogs.70092","DOIUrl":null,"url":null,"abstract":"<p>The majority of research in computational psycholinguistics on sentence processing has focused on word-by-word incremental processing within sentences, rather than holistic sentence-level representations. This study introduces two novel computational approaches for quantifying sentence-level processing: sentence surprisal and sentence relevance. Using multilingual large language models (LLMs), we compute sentence surprisal through three methods, chain rule, next sentence prediction, and negative log-likelihood, and apply a “memory-aware” approach to calculate sentence-level semantic relevance based on convolution operations. The sentence-level metrics developed are tested and compared to validate whether they can predict the reading speed of sentences, and, further, we explore how sentence-level metrics take effects on human processing and comprehending sentences as a whole across languages. The results show that sentence-level metrics are highly capable of predicting sentence reading speed. Our results also indicate that these computational sentence-level metrics are exceptionally effective at predicting and explaining the processing difficulties encountered by readers in processing sentences as a whole across a variety of languages. The proposed sentence-level metrics offer significant interpretability and achieve high accuracy in predicting human sentence reading speed, as they capture unique aspects of comprehension difficulty beyond word-level measures. These metrics serve as valuable computational tools for investigating human sentence processing and advancing our understanding of naturalistic reading. Their strong performance and generalization capabilities highlight their potential to drive progress at the intersection of LLMs and cognitive science.</p>","PeriodicalId":48349,"journal":{"name":"Cognitive Science","volume":"49 7","pages":""},"PeriodicalIF":2.3000,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/cogs.70092","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cognitive Science","FirstCategoryId":"102","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/cogs.70092","RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}
Citations: 0
Abstract
The majority of research in computational psycholinguistics on sentence processing has focused on word-by-word incremental processing within sentences rather than on holistic sentence-level representations. This study introduces two novel computational approaches for quantifying sentence-level processing: sentence surprisal and sentence relevance. Using multilingual large language models (LLMs), we compute sentence surprisal through three methods: the chain rule, next-sentence prediction, and negative log-likelihood. We also apply a "memory-aware" approach based on convolution operations to calculate sentence-level semantic relevance. The resulting sentence-level metrics are tested and compared to validate whether they predict sentence reading speed, and we further explore how they shape the way humans process and comprehend sentences as a whole across languages. The results show that sentence-level metrics are highly capable of predicting sentence reading speed. Our results also indicate that these computational sentence-level metrics are exceptionally effective at predicting and explaining the difficulties readers encounter when processing sentences as a whole across a variety of languages. The proposed sentence-level metrics offer strong interpretability and achieve high accuracy in predicting human sentence reading speed, as they capture aspects of comprehension difficulty not reflected in word-level measures. These metrics serve as valuable computational tools for investigating human sentence processing and advancing our understanding of naturalistic reading. Their strong performance and generalization capabilities highlight their potential to drive progress at the intersection of LLMs and cognitive science.
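To make the chain-rule formulation of sentence surprisal concrete, the sketch below sums token-level negative log-probabilities from an autoregressive language model, so that sentence surprisal equals the sentence's total negative log-likelihood. This is a minimal illustration under assumptions, not the authors' exact pipeline: the model name "gpt2" is a placeholder (the paper uses multilingual LLMs), and the paper's other two methods (next-sentence prediction and the memory-aware, convolution-based relevance measure) are not shown.

```python
# Minimal sketch of chain-rule sentence surprisal, assuming a Hugging Face
# causal LM. The model name is a placeholder; the paper uses multilingual LLMs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()


def sentence_surprisal(sentence: str) -> float:
    """Return -log P(sentence) in nats, summed over predicted tokens (chain rule)."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=input_ids, the returned loss is the mean per-token NLL
        # over the (n - 1) predicted positions.
        mean_nll = model(ids, labels=ids).loss
    return mean_nll.item() * (ids.size(1) - 1)


print(sentence_surprisal("The cat sat on the mat."))
```

Sentence surprisal computed this way can then be entered, alongside word-level predictors, into regression models of per-sentence reading speed from eye-tracking or self-paced reading corpora.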
Journal description:
Cognitive Science publishes articles in all areas of cognitive science, covering such topics as knowledge representation, inference, memory processes, learning, problem solving, planning, perception, natural language understanding, connectionism, brain theory, motor control, intentional systems, and other areas of interdisciplinary concern. Highest priority is given to research reports that are specifically written for a multidisciplinary audience. The audience is primarily researchers in cognitive science and its associated fields, including anthropologists, education researchers, psychologists, philosophers, linguists, computer scientists, neuroscientists, and roboticists.