从婴儿找词研究中建立句子刺激分割计算模型

IF 4.6 Q2 MATERIALS SCIENCE, BIOMATERIALS
Daniel Swingley, Robin Algayres
{"title":"从婴儿找词研究中建立句子刺激分割计算模型","authors":"Daniel Swingley,&nbsp;Robin Algayres","doi":"10.1111/cogs.13427","DOIUrl":null,"url":null,"abstract":"<p>Computational models of infant word-finding typically operate over transcriptions of infant-directed speech corpora. It is now possible to test models of word segmentation on speech materials, rather than transcriptions of speech. We propose that such modeling efforts be conducted over the speech of the experimental stimuli used in studies measuring infants' capacity for learning from spoken sentences. Correspondence with infant outcomes in such experiments is an appropriate benchmark for models of infants. We demonstrate such an analysis by applying the DP-Parser model of Algayres and colleagues to auditory stimuli used in infant psycholinguistic experiments by Pelucchi and colleagues. The DP-Parser model takes speech as input, and creates multiple overlapping embeddings from each utterance. Prospective words are identified as clusters of similar embedded segments. This allows segmentation of each utterance into possible words, using a dynamic programming method that maximizes the frequency of constituent segments. We show that DP-Parse mimics American English learners' performance in extracting words from Italian sentences, favoring the segmentation of words with high syllabic transitional probability. This kind of computational analysis over actual stimuli from infant experiments may be helpful in tuning future models to match human performance.</p>","PeriodicalId":2,"journal":{"name":"ACS Applied Bio Materials","volume":null,"pages":null},"PeriodicalIF":4.6000,"publicationDate":"2024-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/cogs.13427","citationCount":"0","resultStr":"{\"title\":\"Computational Modeling of the Segmentation of Sentence Stimuli From an Infant Word-Finding Study\",\"authors\":\"Daniel Swingley,&nbsp;Robin Algayres\",\"doi\":\"10.1111/cogs.13427\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Computational models of infant word-finding typically operate over transcriptions of infant-directed speech corpora. It is now possible to test models of word segmentation on speech materials, rather than transcriptions of speech. We propose that such modeling efforts be conducted over the speech of the experimental stimuli used in studies measuring infants' capacity for learning from spoken sentences. Correspondence with infant outcomes in such experiments is an appropriate benchmark for models of infants. We demonstrate such an analysis by applying the DP-Parser model of Algayres and colleagues to auditory stimuli used in infant psycholinguistic experiments by Pelucchi and colleagues. The DP-Parser model takes speech as input, and creates multiple overlapping embeddings from each utterance. Prospective words are identified as clusters of similar embedded segments. This allows segmentation of each utterance into possible words, using a dynamic programming method that maximizes the frequency of constituent segments. We show that DP-Parse mimics American English learners' performance in extracting words from Italian sentences, favoring the segmentation of words with high syllabic transitional probability. This kind of computational analysis over actual stimuli from infant experiments may be helpful in tuning future models to match human performance.</p>\",\"PeriodicalId\":2,\"journal\":{\"name\":\"ACS Applied Bio Materials\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2024-03-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1111/cogs.13427\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACS Applied Bio Materials\",\"FirstCategoryId\":\"102\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1111/cogs.13427\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"MATERIALS SCIENCE, BIOMATERIALS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Bio Materials","FirstCategoryId":"102","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/cogs.13427","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MATERIALS SCIENCE, BIOMATERIALS","Score":null,"Total":0}
引用次数: 0

摘要

婴儿找词的计算模型通常是在婴儿引导的语音库转录本上运行的。现在,我们有可能在语音材料而不是语音转录本上测试单词分段模型。我们建议在测量婴儿从口语句子中学习能力的研究中,对实验刺激的语音进行这种建模工作。与婴儿在此类实验中的结果相对应,是婴儿模型的适当基准。我们通过将 Algayres 及其同事的 DP-Parser 模型应用于 Pelucchi 及其同事的婴儿心理语言学实验中使用的听觉刺激,来演示这种分析。DP-Parser 模型将语音作为输入,并从每个语音中创建多个重叠嵌入。前瞻性词语被识别为相似嵌入片段的集群。这样就可以使用动态编程方法将每个语段分割成可能的词语,从而最大限度地提高组成语段的频率。我们的研究表明,DP-Parse 模拟了美式英语学习者从意大利语句子中提取单词的表现,有利于分割音节过渡概率高的单词。这种对婴儿实验中的实际刺激进行的计算分析可能有助于调整未来的模型,使之与人类的表现相匹配。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

Computational Modeling of the Segmentation of Sentence Stimuli From an Infant Word-Finding Study

Computational Modeling of the Segmentation of Sentence Stimuli From an Infant Word-Finding Study

Computational models of infant word-finding typically operate over transcriptions of infant-directed speech corpora. It is now possible to test models of word segmentation on speech materials, rather than transcriptions of speech. We propose that such modeling efforts be conducted over the speech of the experimental stimuli used in studies measuring infants' capacity for learning from spoken sentences. Correspondence with infant outcomes in such experiments is an appropriate benchmark for models of infants. We demonstrate such an analysis by applying the DP-Parser model of Algayres and colleagues to auditory stimuli used in infant psycholinguistic experiments by Pelucchi and colleagues. The DP-Parser model takes speech as input, and creates multiple overlapping embeddings from each utterance. Prospective words are identified as clusters of similar embedded segments. This allows segmentation of each utterance into possible words, using a dynamic programming method that maximizes the frequency of constituent segments. We show that DP-Parse mimics American English learners' performance in extracting words from Italian sentences, favoring the segmentation of words with high syllabic transitional probability. This kind of computational analysis over actual stimuli from infant experiments may be helpful in tuning future models to match human performance.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
ACS Applied Bio Materials
ACS Applied Bio Materials Chemistry-Chemistry (all)
CiteScore
9.40
自引率
2.10%
发文量
464
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信