{"title":"Memory for prediction: A Transformer-based theory of sentence processing","authors":"Soo Hyun Ryu , Richard L. Lewis","doi":"10.1016/j.jml.2025.104670","DOIUrl":null,"url":null,"abstract":"<div><div>We demonstrate that Transformer-based neural network language models provide a new foundation for mechanistic theories of sentence processing that seamlessly integrate expectation-based and memory-based accounts. First, we show that the attention mechanism in GPT2-small operates as a kind of cue-based retrieval architecture that is subject to similarity-based interference. Second, we show that it provides accounts of classic memory effects in parsing, including contrasts involving relative clauses and center-embedding. Third, we show that a simple word-by-word entropy metric computed over the internal attention patterns provides an index of memory interference that explains variance in eye-tracking and self-paced reading time measures (independent of surprisal and other predictors) in two natural story reading time corpora. Because the cues and representations are learned, there is no need for the theorist to postulate representational features and cues. Transformers provide practical modeling tools for exploring the effects of memory and experience, given the increasing availability of both pre-trained models and software for training new models, and the ease with which surprisal and attention entropy metrics may be computed.</div></div>","PeriodicalId":16493,"journal":{"name":"Journal of memory and language","volume":"145 ","pages":"Article 104670"},"PeriodicalIF":3.0000,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of memory and language","FirstCategoryId":"102","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0749596X25000634","RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"LINGUISTICS","Score":null,"Total":0}
Citations: 0
Abstract
We demonstrate that Transformer-based neural network language models provide a new foundation for mechanistic theories of sentence processing that seamlessly integrate expectation-based and memory-based accounts. First, we show that the attention mechanism in GPT2-small operates as a kind of cue-based retrieval architecture that is subject to similarity-based interference. Second, we show that it provides accounts of classic memory effects in parsing, including contrasts involving relative clauses and center-embedding. Third, we show that a simple word-by-word entropy metric computed over the internal attention patterns provides an index of memory interference that explains variance in eye-tracking and self-paced reading time measures (independent of surprisal and other predictors) in two natural story reading time corpora. Because the cues and representations are learned, there is no need for the theorist to postulate representational features and cues. Transformers provide practical modeling tools for exploring the effects of memory and experience, given the increasing availability of both pre-trained models and software for training new models, and the ease with which surprisal and attention entropy metrics may be computed.
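The abstract notes that surprisal and attention entropy metrics are easy to compute from a pre-trained model. As a rough illustration only (not the paper's exact procedure; the example sentence, the averaging over all layers and heads, and the token-level rather than word-level aggregation are assumptions made here), a minimal sketch using the Hugging Face transformers library and the GPT2-small checkpoint might look like this:

import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# GPT2-small corresponds to the "gpt2" checkpoint on the Hugging Face hub.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Illustrative sentence (an object relative clause); not taken from the paper's materials.
sentence = "The reporter who the senator attacked admitted the error."
enc = tokenizer(sentence, return_tensors="pt")
ids = enc["input_ids"][0]

with torch.no_grad():
    out = model(**enc, output_attentions=True)

# Token-level surprisal: -log2 p(token | preceding tokens).
log_probs = torch.log_softmax(out.logits[0], dim=-1)
surprisal = [float("nan")]  # the first token has no preceding context
for i in range(1, ids.size(0)):
    surprisal.append(-log_probs[i - 1, ids[i]].item() / math.log(2))

# Attention entropy: entropy of each token's attention distribution over the
# preceding context, averaged here (as a simplifying assumption) across all
# layers and heads.
attn = torch.stack(out.attentions).squeeze(1)          # (layers, heads, seq, seq)
entropy = -(attn * torch.log2(attn + 1e-12)).sum(-1)   # (layers, heads, seq)
attn_entropy = entropy.mean(dim=(0, 1))                # (seq,)

for tok, s, h in zip(tokenizer.convert_ids_to_tokens(ids), surprisal, attn_entropy.tolist()):
    print(f"{tok:>12}  surprisal={s:6.2f} bits  attention-entropy={h:5.2f} bits")

The point of the sketch is simply that both word-level predictors fall out of a single forward pass; the paper's actual metric may aggregate attention differently (e.g., per layer or head, or by mapping sub-word tokens back to words).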
Journal Introduction:
Articles in the Journal of Memory and Language contribute to the formulation of scientific issues and theories in the areas of memory, language comprehension and production, and cognitive processes. Special emphasis is given to research articles that provide new theoretical insights based on a carefully laid empirical foundation. The journal generally favors articles that provide multiple experiments. In addition, significant theoretical papers without new experimental findings may be published.
The Journal of Memory and Language is a valuable tool for cognitive scientists, including psychologists, linguists, and others interested in memory and learning, language, reading, and speech.
Research Areas include:
• Topics that illuminate aspects of memory or language processing
• Linguistics
• Neuropsychology.