The counting principle makes number words unique
Mira Ariel, Natalia Levshina
Corpus Linguistics and Linguistic Theory, published 2024-03-29
DOI: https://doi.org/10.1515/cllt-2023-0105
Citations: 0
Abstract
Following Ariel (2021. Why it’s hard to construct ad hoc number concepts. In Caterina Mauri, Ilaria Fiorentini, & Eugenio Goria (eds.), Building categories in interaction: Linguistic resources at work, 439–462. Amsterdam: John Benjamins), we argue that number words manifest distinct distributional patterns from open-class lexical items. When modified, open-class words typically take selectors (as in kinda table), which select a subset of their potential denotations (e.g., “nonprototypical table”). They are typically not modified by loosening operators (e.g., approximately), since even if bare, typical lexemes can broaden their interpretation (e.g., table referring to a rock used as a table). Number words, on the other hand, have a single, precise meaning and denotation and cannot take a selector, which would need to select a subset of their (single) denotation (??kinda seven). However, they are often overtly broadened (approximately seven), creating a range of values around N. First, we extend Ariel’s empirical examination to the larger COCA and to Hebrew (HeTenTen). Second, we propose that open-class and number words belong to sparse versus dense lexical domains, respectively, because the former exhibit prototypicality effects, but the latter do not. Third, we further support the contrast between sparse and dense domains by reference to: synchronic word2vec models of sparse and dense lexemes, which testify to their differential distributions, numeral use in noncounting communities, and different renewal rates for the two lexical types.
About the journal:
Corpus Linguistics and Linguistic Theory (CLLT) is a peer-reviewed journal publishing high-quality original corpus-based research focusing on theoretically relevant issues in all core areas of linguistic research, or other recognized topic areas. It provides a forum for researchers from different theoretical backgrounds and different areas of interest who share a commitment to the systematic and exhaustive analysis of naturally occurring language. Contributions from all theoretical frameworks are welcome, but they should be addressed to a general audience and thus be explicit about their assumptions and discovery procedures and provide sufficient theoretical background to be accessible to researchers from different frameworks.
Topics: Corpus Linguistics, Quantitative Linguistics, Phonology, Morphology, Semantics, Syntax, Pragmatics.