Priorities for New Data Collection

IF 3.2 | Region 1: Psychology | JCR Q2: PSYCHOLOGY, DEVELOPMENTAL

Developmental Science. Pub Date: 2025-09-05. DOI: 10.1111/desc.70072

Brian MacWhinney, Catherine Snow




Schaff, Loukatou, Cristia, and Havron (SLC&H) have contributed a fascinating and important analysis of the demographic characteristics of the child language data currently available in the CHILDES database. They were able to supplement information already on the web by soliciting further specifics from many of the original data contributors. They have identified biases in the representation of urbanization, family structure, SES, languages studied, countries represented, and multilingualism. The scarcity of data from rural, non-Western, low-education participants speaking non-Indo-European languages raises concerns about drawing conclusions regarding the universality of phenomena. These concerns echo widespread worries within psychology, sociology, and education about the dominance in research studies of data gathered only from WEIRD (Western, educated, industrialized, rich, and democratic) populations (Henrich et al. 2010).

Child language data had an even more extreme bias in the 1970s, when the bulk of our transcript data came from typically developing children of English-speaking academics, often in the northeastern United States. Since then, the coverage has broadened greatly to include data from 48 languages, variations in SES, and a rich collection of types of multilingualism. Despite this growth in coverage, the database can never be truly representative of all the patterns of variation among the 2.2 billion children on the planet, because coverage on that scale would be extremely difficult to attain. Even with improvements in recording technology (LENA), automatic speech recognition (Liu et al. 2023), natural language processing (Liu and MacWhinney 2024), GenAI (Warstadt and Bowman 2022), and corpus linguistics (Baayen 2010), the collection and analysis of child language samples remain a daunting task. Barriers to data collection include privacy restrictions, researchers who are unwilling to share their data, restrictive IRB policies, lack of recognition for corpus work, logistical problems in rural areas, the need to rely on translators, and scarcity of research support. Given these limitations, the goal of eliminating the gaps so as to produce a fully balanced representation seems unattainable, at least in the near term.

Fortunately, we can make productive use of the gaps and biases identified by SLC&H to guide our research. We can do this by focusing on the contrasts between universals and variation in language acquisition. This line of research begins by first proposing some universal and then collecting data that could falsify the universal. For example, SLC&H point to studies evaluating the universality of the noun bias, late passive acquisition, reduced parental input in rural communities, variations in gesture typology, or the effects of early bilingualism. In each of these areas, a universal is proposed based on evidence from current corpora, and then further data is collected that either confirms or falsifies the universal.

Consider the case of the noun bias described by Gentner (2006). Studies based on samples such as the three children in Brown (1973) do indeed show an early noun bias for the English of children of educated parents in the Boston area when sampled during interviews recorded by graduate students. However, as shown by Sugárné (1970) for Hungarian, the use of verbs increases markedly and surpasses nouns when children are recorded on the playground. Moreover, as Ninio and Snow (1988) have shown, early vocabulary is rich in socially mediated terms that lie outside the noun-verb contrast. When we turn to languages outside of Indo-European, such as Chinese, Korean, or Mayan, we can see a reversal of the noun bias. Thus, both activity and language impact this feature of early vocabularies, suggesting that it may be important to explore the further effects of activity types as well as urbanization, SES, and birth order on this pattern.

To cite another example, using data in CHILDES, Gleason and colleagues (Gleason and Ely 1997; Gleason and Greif 1983) compared the lexicon used in interactions with mothers, with fathers, and over the dinner table and found a great deal of non-overlap between these situations. Lexical non-overlap has also been documented for children learning two languages that are used in very different settings (Yip and Matthews 2007). Although not included in this survey, language disabilities also have enormous and varied impacts on both the overall course and the details of language acquisition (Bishop 1997; Guendouzi et al. 2011).

We can also propose and test universals regarding language teaching methods. WEIRD parents rely on elaborations and recasts to promote children's learning (Sokolov 1993). However, Schieffelin (1985) found that Kaluli mothers relied instead on asking children to repeat phrases after them. Studies of non-Western and rural cultures have shown that they can vary markedly in their use of praise, teasing, emotion terms, honorifics, and other routines. Even more extreme differences in parental output have been documented for groups such as the Navajo or Maya, in which direct parental input to young children is often minimal (Scollon 1976).

Examples of this type could be multiplied dozens of times. However, what is missing from these reports is the detailed transcription of real-life interactions that would allow us to understand these patterns in greater detail. We have no shared transcript data from Kaluli, Mayan, Navajo, or Samoan that would allow us to track the effects of these variations in input. However, there are areas where such data does exist. For example, Gleason's recordings of mother, father, and dinner table talk are in CHILDES, and her published results on lexical non-overlap can be traced in further detail, as can the Yip and Matthews recordings of their bilingual subjects. For SES and ethnic group contrasts, one can look at the transcripts and audio from the Harvard HSLLD (Home-School Study of Language and Literacy Development) and a series of 12 papers analyzing these patterns. This gives us a rich picture of these contrasts in the Boston area, and we can then ask what the results of a similar study would be in Marseille, Manchester, Mombasa, Mumbai, or Mannheim. Data from rural populations and special areas could be particularly informative. On this front, the representation of American, English-speaking children growing up in rural families who are eligible by family income for Head Start will increase with the imminent release of transcripts from the Early Head Start Project (Pan et al. 2005). We can also study alternative patterns of language loss and maintenance as indigenous communities become increasingly linked to the global economy.

To maximize our ability to understand these patterns of variation or universality, we need to create language sampling protocols that allow for cross-linguistic comparison. An example of such an effort is the Global Tales (https://talkbank.org/childes/access/GlobalTales/) project that asks children in the age range between 3 and 6 to tell stories about times when they were either happy, confused, angry, or proud, or when they had to deal with a situation that was either problematic or important. These same questions are being asked by researchers working with children from 25 countries and languages. The results so far demonstrate both variation and universality in the nature of the stories children tell. Most of the data collected so far is from middle-class children in urban settings, and adding data from rural populations and across SES levels is an important goal. Other projects working on cross-cultural and cross-linguistic comparisons include Acquisition Sketch, LITMUS, LaCoLa, Frog Stories, and PLAY.

We can study universals and variation using comparisons across demographic variables. However, we also need to consider the role of individual variation in patterns of acquisition. For example, Peters (1977) contrasted children with precise articulation and those with “mush mouth”. Nelson (1973) contrasted referential and expressive children—a contrast that was then echoed in Bloom et al. (2001). Nelson (1981) further notes that children may shift from one acquisitional strategy to another across time. To examine strategies and processes in detail, Lieven, Tomasello, and colleagues collected densely sampled corpora for English, Finnish, and German. Using such data, they were able to show that, even on the level of argument structures for the English articles, acquisition is highly lexically specific, rather than driven by universal featural structures (Lieven et al. 1997).

Looking back across the 50-plus years since the publication of Brown (1973), we can marvel at the growth in the availability of data on child language acquisition: from a set of transcripts from three children produced on mimeographed sheets to a world with data on thousands of children across 48 languages linked to terabytes of media. Of course, every glass in science is always half empty, and we are always striving for a fuller understanding, but it is heartening to know how much progress has been made. The careful work by SLC&H advances us still further by serving as a guide for new comparisons and by suggesting priorities for new data collection.

The authors declare no conflicts of interest.

