使用 LENA 和 ACLEW 管道建立从长篇录音中提取的指标的可靠性。

IF 5.4 3区 材料科学 Q2 CHEMISTRY, PHYSICAL
ACS Applied Energy Materials Pub Date : 2024-12-01 Epub Date: 2024-09-20 DOI:10.3758/s13428-024-02493-2
Alejandrina Cristia, Lucas Gautheron, Zixing Zhang, Björn Schuller, Camila Scaff, Caroline Rowland, Okko Räsänen, Loann Peurey, Marvin Lavechin, William Havard, Caitlin M Fausey, Margaret Cychosz, Elika Bergelson, Heather Anderson, Najla Al Futaisi, Melanie Soderstrom
{"title":"使用 LENA 和 ACLEW 管道建立从长篇录音中提取的指标的可靠性。","authors":"Alejandrina Cristia, Lucas Gautheron, Zixing Zhang, Björn Schuller, Camila Scaff, Caroline Rowland, Okko Räsänen, Loann Peurey, Marvin Lavechin, William Havard, Caitlin M Fausey, Margaret Cychosz, Elika Bergelson, Heather Anderson, Najla Al Futaisi, Melanie Soderstrom","doi":"10.3758/s13428-024-02493-2","DOIUrl":null,"url":null,"abstract":"<p><p>Long-form audio recordings are increasingly used to study individual variation, group differences, and many other topics in theoretical and applied fields of developmental science, particularly for the description of children's language input (typically speech from adults) and children's language output (ranging from babble to sentences). The proprietary LENA software has been available for over a decade, and with it, users have come to rely on derived metrics like adult word count (AWC) and child vocalization counts (CVC), which have also more recently been derived using an open-source alternative, the ACLEW pipeline. Yet, there is relatively little work assessing the reliability of long-form metrics in terms of the stability of individual differences across time. Filling this gap, we analyzed eight spoken-language datasets: four from North American English-learning infants, and one each from British English-, French-, American English-/Spanish-, and Quechua-/Spanish-learning infants. The audio data were analyzed using two types of processing software: LENA and the ACLEW open-source pipeline. When all corpora were included, we found relatively low to moderate reliability (across multiple recordings, intraclass correlation coefficient attributed to the child identity [Child ICC], was < 50% for most metrics). There were few differences between the two pipelines. Exploratory analyses suggested some differences as a function of child age and corpora. These findings suggest that, while reliability is likely sufficient for various group-level analyses, caution is needed when using either LENA or ACLEW tools to study individual variation. We also encourage improvement of extant tools, specifically targeting accurate measurement of individual variation.</p>","PeriodicalId":4,"journal":{"name":"ACS Applied Energy Materials","volume":null,"pages":null},"PeriodicalIF":5.4000,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Establishing the reliability of metrics extracted from long-form recordings using LENA and the ACLEW pipeline.\",\"authors\":\"Alejandrina Cristia, Lucas Gautheron, Zixing Zhang, Björn Schuller, Camila Scaff, Caroline Rowland, Okko Räsänen, Loann Peurey, Marvin Lavechin, William Havard, Caitlin M Fausey, Margaret Cychosz, Elika Bergelson, Heather Anderson, Najla Al Futaisi, Melanie Soderstrom\",\"doi\":\"10.3758/s13428-024-02493-2\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Long-form audio recordings are increasingly used to study individual variation, group differences, and many other topics in theoretical and applied fields of developmental science, particularly for the description of children's language input (typically speech from adults) and children's language output (ranging from babble to sentences). The proprietary LENA software has been available for over a decade, and with it, users have come to rely on derived metrics like adult word count (AWC) and child vocalization counts (CVC), which have also more recently been derived using an open-source alternative, the ACLEW pipeline. Yet, there is relatively little work assessing the reliability of long-form metrics in terms of the stability of individual differences across time. Filling this gap, we analyzed eight spoken-language datasets: four from North American English-learning infants, and one each from British English-, French-, American English-/Spanish-, and Quechua-/Spanish-learning infants. The audio data were analyzed using two types of processing software: LENA and the ACLEW open-source pipeline. When all corpora were included, we found relatively low to moderate reliability (across multiple recordings, intraclass correlation coefficient attributed to the child identity [Child ICC], was < 50% for most metrics). There were few differences between the two pipelines. Exploratory analyses suggested some differences as a function of child age and corpora. These findings suggest that, while reliability is likely sufficient for various group-level analyses, caution is needed when using either LENA or ACLEW tools to study individual variation. We also encourage improvement of extant tools, specifically targeting accurate measurement of individual variation.</p>\",\"PeriodicalId\":4,\"journal\":{\"name\":\"ACS Applied Energy Materials\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":5.4000,\"publicationDate\":\"2024-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACS Applied Energy Materials\",\"FirstCategoryId\":\"102\",\"ListUrlMain\":\"https://doi.org/10.3758/s13428-024-02493-2\",\"RegionNum\":3,\"RegionCategory\":\"材料科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/9/20 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q2\",\"JCRName\":\"CHEMISTRY, PHYSICAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Energy Materials","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.3758/s13428-024-02493-2","RegionNum":3,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/9/20 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"CHEMISTRY, PHYSICAL","Score":null,"Total":0}
引用次数: 0

摘要

长篇录音越来越多地被用于研究个体差异、群体差异以及发育科学理论和应用领域的许多其他课题,特别是用于描述儿童的语言输入(通常是成人的讲话)和儿童的语言输出(从咿呀学语到句子)。专有的 LENA 软件已问世十多年,用户已开始依赖成人词数(AWC)和儿童发声数(CVC)等衍生指标。然而,就个体差异在不同时期的稳定性而言,评估长式指标可靠性的工作相对较少。为了填补这一空白,我们分析了八个口语数据集:四个数据集来自学习北美英语的婴儿,另一个数据集来自学习英国英语、法语、美国英语/西班牙语和克丘亚语/西班牙语的婴儿。音频数据使用两种处理软件进行分析:LENA 和 ACLEW 开源管道。当包含所有语料库时,我们发现了相对较低到中等的可靠性(在多个录音中,归因于儿童身份的类内相关系数 [Child ICC] 为
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Establishing the reliability of metrics extracted from long-form recordings using LENA and the ACLEW pipeline.

Long-form audio recordings are increasingly used to study individual variation, group differences, and many other topics in theoretical and applied fields of developmental science, particularly for the description of children's language input (typically speech from adults) and children's language output (ranging from babble to sentences). The proprietary LENA software has been available for over a decade, and with it, users have come to rely on derived metrics like adult word count (AWC) and child vocalization counts (CVC), which have also more recently been derived using an open-source alternative, the ACLEW pipeline. Yet, there is relatively little work assessing the reliability of long-form metrics in terms of the stability of individual differences across time. Filling this gap, we analyzed eight spoken-language datasets: four from North American English-learning infants, and one each from British English-, French-, American English-/Spanish-, and Quechua-/Spanish-learning infants. The audio data were analyzed using two types of processing software: LENA and the ACLEW open-source pipeline. When all corpora were included, we found relatively low to moderate reliability (across multiple recordings, intraclass correlation coefficient attributed to the child identity [Child ICC], was < 50% for most metrics). There were few differences between the two pipelines. Exploratory analyses suggested some differences as a function of child age and corpora. These findings suggest that, while reliability is likely sufficient for various group-level analyses, caution is needed when using either LENA or ACLEW tools to study individual variation. We also encourage improvement of extant tools, specifically targeting accurate measurement of individual variation.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
ACS Applied Energy Materials
ACS Applied Energy Materials Materials Science-Materials Chemistry
CiteScore
10.30
自引率
6.20%
发文量
1368
期刊介绍: ACS Applied Energy Materials is an interdisciplinary journal publishing original research covering all aspects of materials, engineering, chemistry, physics and biology relevant to energy conversion and storage. The journal is devoted to reports of new and original experimental and theoretical research of an applied nature that integrate knowledge in the areas of materials, engineering, physics, bioscience, and chemistry into important energy applications.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信