On the Robust Measurement of Inflectional Diversity

Aris Xanthos, Guillaume Guex
Published in: Recent Contributions to Quantitative Linguistics
DOI: 10.1515/9783110420296-020 (https://doi.org/10.1515/9783110420296-020)

Abstract

Lexical diversity measures are notoriously sensitive to variations in sample size, and recent approaches to this issue typically compute the average variety of lexical units in random subsamples of fixed size. This methodology has been extended to measures of inflectional diversity such as the average number of wordforms per lexeme, also known as the mean size of paradigm (MSP) index. In this contribution we argue that, while random sampling can indeed increase the robustness of inflectional diversity measures, using a fixed subsample size is justified only under the hypothesis that the corpora being compared have the same degree of lexematic diversity. In the more general case, where they may differ in lexematic diversity, a more sophisticated strategy can and should be adopted. We propose a novel approach to the measurement of inflectional diversity that aims to cope not only with variations in sample size but also with variations in lexematic diversity. The robustness of this new method is assessed empirically, and the results show that, while there is still room for improvement, the proposed methodology considerably attenuates the impact of lexematic diversity discrepancies on the measurement of inflectional diversity.
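
To make the baseline concrete, the following is a minimal Python sketch of the MSP index and of its averaging over random subsamples of fixed size, as described in the abstract. It is an illustration under stated assumptions, not the authors' implementation: the input format (a list of (wordform, lexeme) pairs), the function names `msp` and `subsampled_msp`, and the parameters `subsample_size`, `n_subsamples` and `seed` are all hypothetical. The novel method proposed in the paper is not specified in the abstract and is therefore not sketched here.

```python
import random
from statistics import mean


def msp(tokens):
    """Mean size of paradigm: distinct wordforms per distinct lexeme.

    `tokens` is a sequence of (wordform, lexeme) pairs; this input
    format is an assumption made for the sketch.
    """
    if not tokens:
        raise ValueError("tokens must not be empty")
    # A wordform is counted once per lexeme it belongs to.
    forms = {(form, lexeme) for form, lexeme in tokens}
    lexemes = {lexeme for _, lexeme in tokens}
    return len(forms) / len(lexemes)


def subsampled_msp(tokens, subsample_size, n_subsamples=100, seed=0):
    """Average MSP over random subsamples of a fixed size.

    Subsamples are drawn without replacement from `tokens` (a list).
    All parameter names are illustrative, not taken from the paper.
    """
    if not 1 <= subsample_size <= len(tokens):
        raise ValueError("subsample_size must be between 1 and len(tokens)")
    rng = random.Random(seed)  # seeded for reproducibility
    return mean(
        msp(rng.sample(tokens, subsample_size))
        for _ in range(n_subsamples)
    )


# Toy corpus: two lexemes, five distinct (wordform, lexeme) pairs.
corpus = [
    ("walk", "walk"), ("walks", "walk"), ("walked", "walk"),
    ("cat", "cat"), ("cats", "cat"), ("cat", "cat"),
]
print(msp(corpus))                # MSP of the whole corpus
print(subsampled_msp(corpus, 3))  # average MSP over subsamples of 3 tokens
```

Using the same `subsample_size` for every corpus controls for sample size only. If two corpora differ in lexematic diversity, equally sized subsamples will contain different numbers of lexemes, which is the situation in which the abstract argues that a fixed subsample size is no longer justified.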