Productivity and semantic transparency

IF 1.3, JCR Q3 (Linguistics)

Mental Lexicon, Pub Date: 2023-03-28, DOI: 10.1075/ml.22009.she

Shen Tian, H. Baayen



Abstract

We used word embeddings to study the relation between productivity and semantic transparency. We compiled a dataset with around 2700 two-syllable compounds that shared position-specific constituents (henceforth pivots) and some 1100 suffixed words. For each pivot and suffix, we calculated measures of productivity as well as measures of semantic transparency. For compounds, productivity (P) was negatively correlated with the number of types (V) and with the semantic similarity between non-pivot constituents and their compounds. Conversely, the greater semantic similarity of the pivot with either the compound or the non-pivot constituent predicted higher degrees of productivity. Visualization with t-SNE revealed clustering of suffixed words’ embeddings, but no by-pivot clustering for compounds, except for a minority of pivots whose regions in semantic space did not contain intruding unrelated compounds. A subset of these pivots was found to realize a fixed shift in semantic space from the base word to the corresponding compound, a property that also emerged for several suffixes. For these pivots, no correlation between P and V was present. Thus, Mandarin compounds appear to realize, at one extreme, motivated but unsystematic concept formation (where other pivots could just as well have been used), and at the other extreme, systematic suffix-like semantics.
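As a rough illustration of the kinds of quantities the abstract refers to, the sketch below makes three assumptions that the abstract itself does not confirm: that P is potential productivity (hapax legomena divided by token count), that semantic similarity is cosine similarity between embedding vectors, and that a "fixed shift" can be probed by comparing each compound-minus-base vector with the mean shift. All function names, word labels, and numbers are hypothetical toy values, not data from the study.

```python
import math
from collections import Counter


def potential_productivity(token_counts):
    """Assumed measure: P = V1 / N (hapax legomena over total tokens)."""
    n_tokens = sum(token_counts.values())
    hapaxes = sum(1 for c in token_counts.values() if c == 1)
    return hapaxes / n_tokens if n_tokens else 0.0


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_shift_consistency(pairs):
    """Average cosine between each (compound - base) shift and the mean shift.

    Values near 1 suggest that one pivot moves its bases in a similar
    direction in embedding space; this is an illustrative proxy only.
    """
    shifts = [[c - b for b, c in zip(base, comp)] for base, comp in pairs]
    dim = len(shifts[0])
    mean = [sum(s[i] for s in shifts) / len(shifts) for i in range(dim)]
    return sum(cosine(s, mean) for s in shifts) / len(shifts)


# Hypothetical token frequencies of four compounds sharing one pivot.
counts = Counter({"compound_a": 40, "compound_b": 9,
                  "compound_c": 1, "compound_d": 1})
V = len(counts)                      # number of types for this pivot
P = potential_productivity(counts)   # 2 hapaxes / 51 tokens

# Hypothetical 2-d embeddings as (base word, compound) pairs.
toy_pairs = [([1.0, 0.0], [1.5, 1.0]),
             ([0.0, 2.0], [0.5, 3.0])]
consistency = mean_shift_consistency(toy_pairs)
```

In practice the vectors would come from pretrained embeddings rather than hand-written lists, and per-pivot values of P, V, and the similarity measures could then be correlated across pivots.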


Source journal: Mental Lexicon (Linguistics)

CiteScore: 1.50

Self-citation rate: 0.00%

About the journal: The Mental Lexicon is an interdisciplinary journal that provides an international forum for research that bears on the issues of the representation and processing of words in the mind and brain. We encourage both the submission of original research and reviews of significant new developments in the understanding of the mental lexicon. The journal publishes work that includes, but is not limited to, the following: models of the representation of words in the mind; computational models of lexical access and production; experimental investigations of lexical processing; neurolinguistic studies of lexical impairment; functional neuroimaging and lexical representation in the brain; lexical development across the lifespan; lexical processing in second language acquisition; the bilingual mental lexicon; lexical and morphological structure across languages; formal models of lexical structure; corpus research on the lexicon; and new experimental paradigms and statistical techniques for mental lexicon research.