{"title":"The Antecedents of Transformer Models","authors":"Simon Dennis, Kevin Shabahang, Hyungwook Yim","doi":"10.1177/09637214241279504","DOIUrl":null,"url":null,"abstract":"Transformer models of language represent a step change in our ability to account for cognitive phenomena. Although the specific architecture that has garnered recent interest is quite young, many of its components have antecedents in the cognitive science literature. In this article, we start by providing an introduction to large language models aimed at a general psychological audience. We then highlight some of the antecedents, including the importance of scale, instance-based memory models, paradigmatic association and systematicity, positional encodings of serial order, and the learning of control processes. This article offers an exploration of the relationship between transformer models and their precursors, showing how they can be understood as a next phase in our understanding of cognitive processes.","PeriodicalId":10802,"journal":{"name":"Current Directions in Psychological Science","volume":"39 1","pages":""},"PeriodicalIF":7.4000,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Current Directions in Psychological Science","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.1177/09637214241279504","RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PSYCHOLOGY, MULTIDISCIPLINARY","Score":null,"Total":0}
引用次数: 0
Abstract
Transformer models of language represent a step change in our ability to account for cognitive phenomena. Although the specific architecture that has garnered recent interest is quite young, many of its components have antecedents in the cognitive science literature. In this article, we start by providing an introduction to large language models aimed at a general psychological audience. We then highlight some of the antecedents, including the importance of scale, instance-based memory models, paradigmatic association and systematicity, positional encodings of serial order, and the learning of control processes. This article offers an exploration of the relationship between transformer models and their precursors, showing how they can be understood as a next phase in our understanding of cognitive processes.
Journal introduction:
Current Directions in Psychological Science publishes reviews by leading experts covering all of scientific psychology and its applications. Each issue of Current Directions features a diverse mix of reports on topics such as language, memory and cognition, development, the neural basis of behavior and emotion, psychopathology, and theory of mind. These articles allow readers to stay apprised of important developments in subfields beyond their own areas of expertise and in bodies of research they might not otherwise encounter. Articles in Current Directions are also written to be accessible to non-experts, making them well suited for classroom use as teaching supplements.