Glottometrics: Latest Articles

The meaning distributions on different levels of granularity
Glottometrics Pub Date: 2023-01-01 DOI: 10.53482/2023_54_405
T. Yih, Haitao Liu
Abstract: The meaning distributions of certain linguistic forms generally follow a Zipfian distribution. However, since meanings can be observed and classified on different levels of granularity, it is interesting to ask whether their distributions on different levels can be fitted by the same model and whether the parameters are the same. In this study, we investigate three quasi-prepositions in Shanghainese, a dialect of Wu Chinese, and test whether the meaning distributions on two levels of granularity can be fitted by the same model and whether the parameters are close. The results first show that the three models proposed by modern quantitative linguists can all achieve a good fit in all cases, while both the exponential (EXP) model and the right-truncated negative binomial (RTBN) model behave better than the modified right-truncated Zipf-Alekseev distribution (MRTZA) in terms of the consistency of the goodness of fit, parameter change, rationality, and simplicity. Second, the parameters of the distributions on the two levels and the curves are not exactly the same or even close to each other. This supports a weak view of the concept of ‘scaling’ in complex sciences. Finally, differences are found between the distributions on the two levels: the fine-grained meaning distributions are more right-skewed and more non-linear. This is attributed to the openness of the categories of systems. The finer semantic differentiation behaves like systems with an open set of categories, while the coarse-grained meaning distribution resembles those having a closed set of few categories.
Citations: 0
Fellow or foe? A quantitative thematic exploration into Putin's and Trump's stylometric features
Glottometrics Pub Date: 2023-01-01 DOI: 10.53482/2023_54_406
Yaqin Wang, Ting Zeng
Abstract: Thematic concentration, a quantitative linguistic method, can reflect the speech style of a particular person. It may, to some degree, reflect the degree of a speaker’s intention to communicate certain themes. There has been limited empirical research on the similarity between Trump and Putin with respect to their linguistic features. Thus, the present study aims to compare Putin’s and Trump’s stylometric features and political themes based on thematic concentration, using a corpus of Putin’s, Medvedev’s, Trump’s, and Obama’s speeches. Results show that 1) the thematic concentration values of both Putin’s and Trump’s speeches are significantly or marginally significantly different from those of their predecessors; 2) the two leaders pay great attention to the concept of nationalism; 3) the thematic words of their speeches differ slightly across periods, reflecting the influence of external factors on the choice of language. The results of the present study may shed light on the stylometric studies of Putin and Trump.
Citations: 0
A comparison of two text specificity measures analyzing a heterogenous text corpus
Glottometrics Pub Date: 2023-01-01 DOI: 10.53482/2023_54_404
A. Oleinik
Abstract: The article compares the performance of two term specificity measures, Cohen’s d and Z-score, when analyzing political and media discourses on Russia’s war in Ukraine in four languages and five countries. In addition to being linguistically and stylistically heterogeneous, the 3,347 texts included in the corpus vary in length. The two measures display convergent validity, as confirmed by various performance metrics. It is argued that the measures can be adapted to a broader range of tasks in information retrieval and digital humanities, in addition to their usefulness for text mining and content analysis.
Citations: 1
The journal SMIL - Statistical Methods in Linguistics (1962-1976) - some notes about the history of quantitative linguistics in Scandinavia and beyond
Glottometrics Pub Date: 2023-01-01 DOI: 10.53482/2023_54_408
E. Kelih
Abstract: This article deals with the history of quantitative linguistics. Its focus is the journal SMIL – Statistical Methods in Linguistics, which was published by Hans Karlgren in Stockholm from 1962 to 1976 (with a short interruption between 1966 and 1969). SMIL is a representative example of the process of differentiation in quantitative linguistics during the seventies and can be seen as an early major “Scandinavian” contribution to statistical and quantitative linguistics.
Citations: 0
Quantifying syntax similarity with a polynomial representation of dependency trees
Glottometrics Pub Date: 2022-11-13 DOI: 10.48550/arXiv.2211.07005
Peng Liu, Tinghao Feng, Rui Liu
Abstract: We introduce a graph polynomial that distinguishes tree structures to represent dependency grammar and a measure based on the polynomial representation to quantify syntax similarity. The polynomial encodes accurate and comprehensive information about the dependency structure and dependency relations of words in a sentence, which enables in-depth analysis of dependency trees with data analysis tools. We apply the polynomial-based methods to analyze sentences in the Parallel Universal Dependencies treebanks. Specifically, we compare the syntax of sentences and their translations in different languages, and we perform a syntactic typology study of available languages in the Parallel Universal Dependencies treebanks. We also demonstrate and discuss the potential of the methods in measuring syntax diversity of corpora.
Citations: 1
Attempting at parametrization of moderate-length poetic texts: Moses, a poem by Ivan Franko
Glottometrics Pub Date: 2022-01-01 DOI: 10.53482/2022_53_399
S. Buk, Andrij Rovenchak
Abstract: The aim of this study is to find parameters that can be used for classification of not very long texts, for example, by author, genre, etc. We go through various known parameters and analyze to what extent they are useful for the intended purposes. We also suggest some improvements that need to be checked further. We calculate the values of parameters at various points of a text comprising N tokens (running words) counted from the beginning of the text. As parameters with prospects for author and/or language attribution we identify, in particular, the h-point scaling coefficient, Yule’s K, relative repeat rate, and the fraction of dis legomena. These parameters demonstrate quite stable behavior in N. Another set includes scaling exponents of parameters with respect to N. Certain modifications are suggested for Lambda and entropy, introducing logarithmic corrections in the form of powers of ln N. The results are applicable to texts of thousands to tens of thousands of words.
Citations: 1
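The idea of evaluating parameters on the first N tokens of a text can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the standard definition of Yule's K, K = 10^4 (Σ m² V_m − N) / N², where V_m is the number of word types occurring m times, and it assumes that the fraction of dis legomena is normalized by vocabulary size, which the abstract does not specify. The helper names (`yules_k`, `dis_legomena_fraction`, `track`) and the toy token list are hypothetical.

```python
from collections import Counter


def yules_k(tokens):
    """Yule's K = 10^4 * (sum_m m^2 * V_m - N) / N^2 (standard definition)."""
    n = len(tokens)
    if n == 0:
        raise ValueError("empty token list")
    freqs = Counter(tokens)              # word type -> frequency
    spectrum = Counter(freqs.values())   # frequency m -> number of types V_m
    s2 = sum(m * m * v_m for m, v_m in spectrum.items())
    return 1e4 * (s2 - n) / (n * n)


def dis_legomena_fraction(tokens):
    """Share of word types occurring exactly twice.

    Assumption: normalized by vocabulary size; the paper may normalize differently.
    """
    freqs = Counter(tokens)
    if not freqs:
        raise ValueError("empty token list")
    return sum(1 for c in freqs.values() if c == 2) / len(freqs)


def track(tokens, step):
    """Evaluate both parameters on the first N tokens, N = step, 2*step, ..."""
    return [
        (n, yules_k(tokens[:n]), dis_legomena_fraction(tokens[:n]))
        for n in range(step, len(tokens) + 1, step)
    ]


toy_text = "a b a c b a d e d f".split()  # toy data, not taken from the poem
for n, k, d in track(toy_text, 5):
    print(n, round(k, 2), round(d, 3))
```

On a real text, `step` would be in the hundreds or thousands of tokens, and a parameter counts as stable if its values change little as N grows.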
Statistical tests for text homogeneity: using forward and backward processes of numbers of different words
Glottometrics Pub Date: 2022-01-01 DOI: 10.53482/2022_53_401
Berhane Abebe, M. Chebunin, A. Kovalevskii, N. Zakrevskaya
Abstract: This article studies how the number of different words in a text grows when the text is read in the forward and backward directions. Based on a statistic obtained from the difference between these two processes, we construct a statistical test, which is used to check text homogeneity. The elementary model states that words in a text are selected from some dictionary independently of each other according to the Zipf–Mandelbrot law. P-values of the statistical test are calculated based on the elementary probabilistic model using the asymptotic normality of the corresponding statistics. Last but not least, the test is applied to the analysis of the homogeneity of sequences of sonnets.
Citations: 1
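The two counting processes described in this abstract can be sketched in Python as follows. This is only an illustration of the forward and backward counts and their pointwise difference. The paper's actual test statistic, its normalization, and the p-value computation under the Zipf–Mandelbrot model are not reproduced here, and the function names and toy sequence are hypothetical.

```python
def distinct_counts(tokens):
    """counts[k-1] = number of different words among the first k tokens."""
    seen = set()
    counts = []
    for tok in tokens:
        seen.add(tok)
        counts.append(len(seen))
    return counts


def forward_backward_difference(tokens):
    """Pointwise difference between the forward and backward counting processes."""
    forward = distinct_counts(tokens)
    backward = distinct_counts(tokens[::-1])  # same text read from the end
    return [f - b for f, b in zip(forward, backward)]


toy = "a a a b c d".split()  # toy sequence: repetitive start, varied end
print(distinct_counts(toy))
print(distinct_counts(toy[::-1]))
print(forward_backward_difference(toy))
```

Both processes end at the same value (the vocabulary size), so the difference is always zero at the final position. Large deviations in between suggest that new words are introduced unevenly across the text, which is the kind of inhomogeneity such a test is meant to detect.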
Dynamics of language in social emergency: investigating COVID-19 hot words on Weibo
Glottometrics Pub Date: 2022-01-01 DOI: 10.53482/2022_52_395
Yi Zhou, Rui Li, Guangfeng Chen, Haitao Liu
Abstract: Drawing on word embedding techniques and tracking the frequency and semantic change of hot words on Sina Weibo during the COVID-19 pandemic, this study investigates how language and discourse change during a crisis. More specifically, correlation tests were conducted between word frequency ranks, pandemic data, and word meaning change ratio. Results indicated that the frequency of some hot words changed with both pandemic data and the frequency of other hot words, which were significantly correlated with the American pandemic data rather than that of China. Moreover, February of 2020 saw the most distinctive semantic changes, marked by a large part of the nearest neighbors for WAR metaphors. The correlations between changes in the frequency and nearest neighbors of COVID-19 related hot words exhibited some acceptable peculiarities. This study demonstrates the feasibility of studying discourse through language change by observing minor semantic change at the connotation level on social media, which adds a new perspective to the impact of the COVID-19 pandemic.
Citations: 0
Book review - On Invisible Language in Modern English: A Corpus-based Approach to Ellipsis. By Evelyn Gandón-Chapela. London: Bloomsbury Academic. 2020
Glottometrics Pub Date: 2022-01-01 DOI: 10.53482/2022_52_398
Zheyuan Dai
Citations: 0
A Corpus-Driven Study of the Style Variation in The Grapes of Wrath
Glottometrics Pub Date: 2022-01-01 DOI: 10.53482/2022_52_396
Yiyang Hu, Qingshun He
Abstract: The novel The Grapes of Wrath is distinctive in its arrangement of intercalary chapters and narrative chapters. Existing studies of the narratological distinction of this novel are primarily qualitative. This article conducts a corpus-driven study of the variation of styles in this novel from the perspectives of word cluster, type-token ratio, descriptivity and activity, keyness, and sentiment. The cluster analysis shows that the choice of words in the narrative chapters is more consistent than that in the intercalary chapters. The type-token ratio analysis testifies to the heterogeneity of the intercalary chapters in terms of lexical richness. The descriptivity and activity analysis and the keyness analysis reveal that the narrative chapters are more active than the intercalary chapters. The sentiment analysis finds that the novel is pervaded by negative sentiments and that negative sentiments are more prevalent in the narrative chapters than in the intercalary chapters. The research concludes that the corpus-driven study can provide insights into the narrative structure and the stylistic variation of the novel.
Citations: 0