The effects of type and token frequency on word length: a cross-linguistic study

Q1 Arts and Humanities
T. Berg, Peter Zörnig, Charlotte Lehr
Glottotheory, vol. 13, no. 1, pp. 173-209. Published 2022-11-01. DOI: 10.1515/glot-2022-2007 (https://doi.org/10.1515/glot-2022-2007)
Citations: 1

Abstract

Inspired by Zipf’s Law of Abbreviation, previous research has mostly addressed the interaction of word length and token frequency. Much less is known about the relationship between word length and type frequency, let alone about the differential impact of type and token frequency on word length. These issues are examined on the basis of a non-representative sample of 10 languages. The token frequency analysis reveals that 8 of the 10 languages show a monotonic decrease in frequency with increasing length, while 2 languages show a unimodal distribution. By contrast, in the type frequency analysis all 10 languages exhibit a rise followed by a monotonic drop in the frequency curve. There appears to be a notable effect of type frequency on the nature of the token frequency distribution: the greater the average length of the words in the lexicon, the higher the probability of a unimodal distribution. Two principles are required to account for these results: a general dispreference for using long words and a language-particular dispreference for short words in the lexicon.
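
To make the type/token distinction concrete, here is a minimal Python sketch of how the two frequency distributions over word length could be computed and their shapes compared. It is an illustration, not the authors' procedure: the function names `length_distributions` and `shape` are hypothetical, word length is measured in characters purely as an assumption (the abstract does not state the unit), and the token list is invented toy data.

```python
from collections import Counter


def length_distributions(tokens):
    """Return (type_freq, token_freq), each mapping word length to a count.

    Token frequency counts every occurrence of a word in the text;
    type frequency counts each distinct word once.
    Length is measured in characters here (an assumption).
    """
    token_freq = Counter(len(w) for w in tokens)
    type_freq = Counter(len(w) for w in set(tokens))
    return dict(type_freq), dict(token_freq)


def shape(freq_by_length):
    """Classify a length distribution as 'monotonic decrease',
    'unimodal' (strict rise, then strict fall), or 'other'."""
    lengths = range(min(freq_by_length), max(freq_by_length) + 1)
    counts = [freq_by_length.get(n, 0) for n in lengths]
    peak = counts.index(max(counts))
    rising = all(a < b for a, b in zip(counts[:peak], counts[1:peak + 1]))
    falling = all(a > b for a, b in zip(counts[peak:], counts[peak + 1:]))
    if not (rising and falling):
        return "other"
    return "monotonic decrease" if peak == 0 else "unimodal"


# Invented toy text: short words are used often, but the "lexicon"
# (the set of distinct words) has most of its entries at length 3.
toy_tokens = (
    ["a"] * 10
    + ["of"] * 5 + ["to"] * 3
    + ["the"] * 4 + ["cat", "dog"]
    + ["word"] * 2 + ["long"]
    + ["short"]
)

type_freq, token_freq = length_distributions(toy_tokens)
print("types by length: ", sorted(type_freq.items()), shape(type_freq))
print("tokens by length:", sorted(token_freq.items()), shape(token_freq))
```

On this toy input the token counts fall steadily with length while the type counts rise to a peak at length 3 and then fall, which mirrors the majority pattern described in the abstract. The strict-inequality test in `shape` is a simplification; real data with ties or noise would need a more tolerant criterion.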
Source journal: Glottotheory (Arts and Humanities: History)
CiteScore: 0.70
Self-citation rate: 0.00%
Articles published: 8
Journal description: The foci of Glottotheory are: observations and descriptions of all aspects of language and text phenomena, including the areas of psycholinguistics, sociolinguistics, dialectology, pragmatics, etc., on all levels of linguistic analysis; applications of methods, models or findings from quantitative linguistics concerning problems of natural language processing, language teaching, documentation and information retrieval; methodological problems of linguistic measurement, model construction, sampling and test theory; epistemological issues such as explanation of language and text phenomena; contributions to theory construction, systems theory, philosophy of science. The journal considers itself a platform for a dialogue between quantitative and qualitative linguistics.