Uncovering patterns of semantic predictability in sentence processing

IF 3.0 | Zone 1, Psychology | JCR Q1, Linguistics

Journal of Memory and Language, Volume 144, Article 104653 | Pub Date: 2025-05-23 | DOI: 10.1016/j.jml.2025.104653

Cassandra L. Jacobs, Ryan J. Hubbard, Loïc Grobol, Kara D. Federmeier


Citations: 0

Abstract


Uncovering patterns of semantic predictability in sentence processing

Psycholinguistic researchers often collect cloze probabilities in order to measure the predictability of upcoming words but have largely discarded the variability in the structure of responses people provide. This variability in the semantic structure of responses may be important for understanding selection during language production; however, it has proven difficult to model the semantic variability of participants’ responses, and thus upcoming semantic uncertainty. Recent advances in large language models (LLMs) permit us to approximate the degree of semantic variability in cloze responses, but most methods are restricted to symbolic or hand-crafted meaning representations. We show in two studies that Bayesian Gaussian mixture models can cluster LLM representations of participants’ responses and produce coherent, taxonomically similar clusters. We apply these clustering algorithms to response time data in a serial cloze task and show that the semantic structure of cloze responses influences how quickly people are able to provide a response. We show clear effects of semantic competition on production speed. In addition to providing novel operationalizations of what semantic competition might look like in the cloze task, we explain how this clustering method is extensible to other datasets and applications of interest to researchers of semantic processing in psycholinguistics.
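As a rough illustration of the abstract's opening point, the sketch below computes cloze probabilities for two sentence frames that share the same top cloze probability but differ in how spread out the remaining responses are. The response lists are invented for illustration, and the function names and the use of Shannon entropy as a summary of response variability are assumptions of this sketch, not measures taken from the paper.

```python
from collections import Counter
from math import log2


def cloze_probabilities(responses):
    """Return each response's share of all responses for one sentence frame."""
    counts = Counter(r.strip().lower() for r in responses)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def response_entropy(probabilities):
    """Shannon entropy (bits) of a cloze distribution; higher means more spread."""
    return -sum(p * log2(p) for p in probabilities.values() if p > 0)


# Hypothetical responses to two sentence frames (invented for illustration).
frame_a = ["coffee"] * 6 + ["tea"] * 4
frame_b = ["coffee"] * 6 + ["tea", "milk", "juice", "water"]

probs_a = cloze_probabilities(frame_a)
probs_b = cloze_probabilities(frame_b)

# Both frames have the same top cloze probability, but frame_b's remaining
# responses are spread over more alternatives, so its entropy is higher.
print(max(probs_a.values()), round(response_entropy(probs_a), 3))
print(max(probs_b.values()), round(response_entropy(probs_b), 3))
```

This sketch counts surface word forms only, so it treats every distinct response as equally different from every other. It does not attempt the semantic grouping that the abstract describes, in which Bayesian Gaussian mixture models cluster LLM representations of the responses.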


Source journal

Journal of Memory and Language (Medicine: Psychology)

CiteScore: 8.70
Self-citation rate: 14.00%
Articles published: (not listed)
Review time: 12.7 weeks

About the journal: Articles in the Journal of Memory and Language contribute to the formulation of scientific issues and theories in the areas of memory, language comprehension and production, and cognitive processes. Special emphasis is given to research articles that provide new theoretical insights based on a carefully laid empirical foundation. The journal generally favors articles that provide multiple experiments. In addition, significant theoretical papers without new experimental findings may be published. The Journal of Memory and Language is a valuable tool for cognitive scientists, including psychologists, linguists, and others interested in memory and learning, language, reading, and speech. Research areas include topics that illuminate aspects of memory or language processing, linguistics, and neuropsychology.