Automatic parsing of parental verbal input.

Kenji Sagae, Brian MacWhinney, Alon Lavie
Behavior research methods, instruments, & computers : a journal of the Psychonomic Society, Inc, 36(1), 113-126. Published 2004-02-01. Journal Article.
DOI: 10.3758/bf03195557 (https://doi.org/10.3758/bf03195557)
Citations: 19

Abstract

To evaluate theoretical proposals regarding the course of child language acquisition, researchers often need to rely on the processing of large numbers of syntactically parsed utterances, both from children and from their parents. Because it is so difficult to do this by hand, there are currently no parsed corpora of child language input data. To automate this process, we developed a system that combined the MOR tagger, a rule-based parser, and statistical disambiguation techniques. The resultant system obtained nearly 80% correct parses for the sentences spoken to children. To achieve this level, we had to construct a particular processing sequence that minimizes problems caused by the coverage/ambiguity tradeoff in parser design. These procedures are particularly appropriate for use with the CHILDES database, an international corpus of transcripts. The data and programs are now freely available over the Internet.
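
The abstract does not spell out the processing sequence, so the following is only a minimal sketch of one common way to manage a coverage/ambiguity tradeoff: try a narrow, low-ambiguity grammar first, fall back to broader passes only when it fails, and let a scoring function choose among the candidates a pass returns. Every name here (`parse_with_fallback`, `strict_pass`, `relaxed_pass`, `toy_score`) is hypothetical, the toy grammars and the bracket-counting score are stand-ins for a real rule-based parser and a real statistical model, and none of it should be read as the authors' actual system or as the MOR tagger's interface.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Parse = str                                    # a bracketed analysis, kept as a string
Parser = Callable[[Sequence[str]], List[Parse]]
Scorer = Callable[[Parse], float]


def parse_with_fallback(
    tags: Sequence[str], passes: Sequence[Parser], score: Scorer
) -> Optional[Tuple[int, Parse]]:
    """Run parser passes in order (strictest first).

    Returns (index of the pass that succeeded, highest-scoring parse),
    or None if no pass produced any analysis.
    """
    for index, parser in enumerate(passes):
        candidates = parser(tags)
        if candidates:
            # Disambiguation step: keep the candidate the scorer prefers.
            return index, max(candidates, key=score)
    return None


def strict_pass(tags: Sequence[str]) -> List[Parse]:
    # Hypothetical narrow grammar: accepts only determiner-noun-verb.
    return ["(S (NP det n) (VP v))"] if list(tags) == ["det", "n", "v"] else []


def relaxed_pass(tags: Sequence[str]) -> List[Parse]:
    # Hypothetical broad grammar: always answers, but ambiguously.
    flat = "(FRAG " + " ".join(tags) + ")"
    chunked = "(S " + " ".join("(X " + t + ")" for t in tags) + ")"
    return [flat, chunked]


def toy_score(parse: Parse) -> float:
    # Stand-in for a statistical model: prefer fewer constituents.
    return -parse.count("(")


if __name__ == "__main__":
    passes = [strict_pass, relaxed_pass]
    print(parse_with_fallback(["det", "n", "v"], passes, toy_score))
    print(parse_with_fallback(["n", "n"], passes, toy_score))
```

Ordering the passes from strict to relaxed means the ambiguous, high-coverage grammar is consulted only for utterances the precise grammar cannot handle, which limits how often the scorer has to choose among competing analyses.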
