Chunk-based incremental processing and learning: An integrated theory of word discovery, implicit statistical learning, and speed of lexical processing.
Authors: Andrew Jessop, Julian Pine, Fernand Gobet
Journal: Psychological Review
DOI: 10.1037/rev0000564
Published: 2025-05-12
Impact factor: 5.1 (JCR Q1, Psychology)
Citations: 0
Abstract
According to chunking theories, children discover their first words by extracting subsequences embedded in their continuous input. However, the mechanisms proposed in these accounts are often incompatible with data from other areas of language development. We present a new theory to connect the chunking accounts of word discovery with the broader developmental literature. We argue that (a) children build a diverse collection of chunks, including words, multiword phrases, and sublexical units; (b) these chunks have different processing times determined by how often each chunk is used to recode the input; and (c) these processing times interact with short-term memory limitations and incremental processing to constrain learning. We implemented this theory as a computational modeling architecture called Chunk-Based Incremental Processing and Learning (CIPAL). Across nine studies, we demonstrate that CIPAL can model word discovery in different contexts. First, we trained the model with 70 child-directed speech corpora from 15 languages. CIPAL gradually discovered words in each language, with cross-linguistic variation in performance. The model's average processing time also improved with experience, resembling the developmental changes observed in children's speed of processing. Second, we showed that CIPAL could simulate seven influential effects reported in statistical learning experiments with artificial languages. These included a preference for words over nonwords, part words, frequency-matched part words, phantom words, and sublexical units. On this basis, we argue that incremental chunking is an effective implicit statistical learning mechanism that may be central to children's vocabulary development. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
Journal overview:
Psychological Review publishes articles that make important theoretical contributions to any area of scientific psychology, including systematic evaluation of alternative theories.