The Missing Half of Language Learning in Current Developmental Language Models: Exogenous and Endogenous Linguistic Input
Nan Zhao, Xufeng Duan, Zhenguang G Cai
Open Mind, volume 9, pages 1543-1549. Journal Article. Published 2025-09-17.
DOI: 10.1162/OPMI.a.33 (https://doi.org/10.1162/OPMI.a.33)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12506926/pdf/
Citations: 0
Abstract
Developmental language models (DLMs) aim to replicate the efficiency of child language acquisition but often focus solely on the estimation of exogenous linguistic input. We argue that a child's linguistic growth is also critically shaped by endogenous processes, including (1) co-opting language in non-linguistic perception and cognition, (2) engaging in private and inner speech, and (3) benefiting from neural replay of linguistic information during sleep. These endogenous processes amplify and refine exogenous linguistic input in ways that current DLMs do not replicate. To align DLMs with child language acquisition, we propose redefining "linguistic exposure" to encompass both exogenous and endogenous linguistic input. By integrating label feedback, self-generated speech, and sleep-like consolidation, researchers can narrow the gap between artificial and human learning. Collaborations across machine learning, psychology, and linguistics will be essential to ground models in empirical data on child behavior and build DLMs that truly reflect the marvel of language acquisition.