Statistical bias control in typology

Impact factor: 1.7 | Zone 2 (Literature) | JCR category: LANGUAGE & LINGUISTICS
Matías Guzmán Naranjo, Laura Becker
Journal: Linguistic Typology
Volume: 26 1
Pages: 605 - 670
Publication date: 2021-11-17
Publication type: Journal Article
DOI: 10.1515/lingty-2021-0002
URL: https://doi.org/10.1515/lingty-2021-0002
Citations: 9

Abstract

In this paper, we propose two new statistical controls for genealogical and areal bias in typological samples. Taking the effect of VO order on affix position (prefixation vs. suffixation) as our test case, we show how statistical modeling that includes a phylogenetic regression term (phylogenetic control) and a two-dimensional Gaussian Process (areal control) can capture genealogical and areal effects in a large but unbalanced sample. We find that, once these biases are controlled for, VO order has no effect on affix position. Another important finding, in line with previous studies, is that areal effects are as important as genealogical effects, which underscores the importance of areal or contact control in typological studies built on language samples. We also show that strict probability sampling is not required with the statistical controls we propose, as long as the sample is a variety sample large enough to cover different areas and families. The crucial practical consequence is that we can include as much of the available information as possible, without artificially restricting the sample and potentially losing otherwise available information.
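To make the idea of an areal control more concrete, the following is a minimal sketch of the kind of covariance structure a two-dimensional Gaussian Process places over language locations. It is not the authors' implementation: the squared-exponential kernel, the function names, the parameter values (`sigma`, `length_scale`), and the coordinates are all assumptions chosen for illustration, and distances are computed naively in (longitude, latitude) space.

```python
import math


def sq_exp_kernel(p, q, sigma=1.0, length_scale=10.0):
    """Squared-exponential covariance between two (longitude, latitude) points.

    Assumption: plain Euclidean distance in coordinate space, used here only
    to illustrate a GP over two input dimensions.
    """
    sq_dist = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return sigma ** 2 * math.exp(-sq_dist / (2.0 * length_scale ** 2))


def areal_covariance(coords, sigma=1.0, length_scale=10.0):
    """Build the full covariance matrix for a list of language locations."""
    return [
        [sq_exp_kernel(p, q, sigma, length_scale) for q in coords]
        for p in coords
    ]


# Hypothetical locations for three made-up languages: two close neighbours
# and one far away. These are not data from the paper.
coords = [(0.0, 0.0), (3.0, 4.0), (60.0, 80.0)]
K = areal_covariance(coords)

for row in K:
    print(["%.4f" % value for value in row])
```

Under these assumptions, the two neighbouring languages receive a high covariance and the distant one a covariance near zero, so a model with such a term lets nearby languages share information about the response (here, affix position) instead of treating every language as an independent data point.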
Source journal: Linguistic Typology
CiteScore: 3.70
Self-citation rate: 5.00%
Articles published: 13
About the journal: Linguistic Typology provides a forum for all work of relevance to the study of language typology and cross-linguistic variation. It welcomes work taking a typological perspective on all domains of the structure of spoken and signed languages, including historical change, language processing, and sociolinguistics. Diverse descriptive and theoretical frameworks are welcomed so long as they have a clear bearing on the study of cross-linguistic variation. We welcome cross-disciplinary approaches to the study of linguistic diversity, as well as work dealing with just one or a few languages, as long as it is typologically informed and typologically and theoretically relevant, and contains new empirical evidence.