Journal of Quantitative Linguistics: Latest Articles

A Comprehensive Study of the Parameters in the Creation and Comparison of Feature Vectors in Distributional Semantic Models
IF 1.4, Zone 2, Literature
Journal of Quantitative Linguistics Pub Date: 2019-03-12 DOI: 10.1080/09296174.2019.1570897
A. Dobó, J. Csirik
Abstract: Measuring the semantic similarity and relatedness of words can play a vital role in many natural language processing tasks. Distributional semantic models computing these measures can have many different parameters, such as different weighting schemes, vector similarity measures, feature transformation functions and dimensionality reduction techniques. Despite their importance, there is no truly comprehensive study simultaneously evaluating the numerous parameters of such models, while also considering the interaction of these parameters with each other. We would like to address this gap with our systematic study. Taking the necessary distributional information extracted from the chosen dataset as already granted, we evaluate all important aspects of the creation and comparison of feature vectors in distributional semantic models. Testing altogether 10 parameters simultaneously, we try to find the best combination of parameter settings, with a large number of settings examined in case of some of the parameters. Besides evaluating the conventionally used settings for the parameters, we also propose numerous novel variants, as well as novel combinations of parameter settings, some of which significantly outperform the combinations of settings in general use, thus achieving state-of-the-art results.
Citations: 5
A Systemic Dynamics Model of Text Production
IF 1.4, Zone 2, Literature
Journal of Quantitative Linguistics Pub Date: 2019-02-11 DOI: 10.1080/09296174.2019.1567301
Giacomo P. Figueredo, G. Figueredo
Abstract: This paper introduces a quantitative model of text as it unfolds in time. The model conceptualizes text as a functional unit of language. This organization can be difficult to identify because it forms complex patterns of linguistic laws, probability and dynamics. These patterns are covert configurations and need complex methods to be investigated. One such method is to draw from qualitative frameworks derived from the quantitative properties of language. Previous studies have shown that covert configurations can be obtained through qualitative frameworks. When dynamics is considered, however, a model of text production including the variable time is needed. This paper therefore aims at addressing this research gap by proposing a dynamics model of text unfolding. It draws from systemic theory and models its categories quantitatively. Time is introduced as variation of choice. The model is applied to a sample of text. Results show how individual choices contribute to text unfolding, describing the amount of meanings at any given moment in text time. In addition, the dynamic accumulation indicates core characteristics of a text, which can be further explored in text behaviour and typology.
Citations: 3
Quantitative Approaches to the Russian Language
IF 1.4, Zone 2, Literature
Journal of Quantitative Linguistics Pub Date: 2019-01-10 DOI: 10.1080/09296174.2018.1558834
E. Kelih
Abstract: The omnibus volume under review comprises 10 individual chapters by 22 authors, thus most of the chapters are co-authored. This seems to reflect the overall interdisciplinary approach focus of the ...
Citations: 14
Statistical Analysis of the Tables in Mahadevan’s Concordance of the Indus Valley Script
IF 1.4, Zone 2, Literature
Journal of Quantitative Linguistics Pub Date: 2019-01-02 DOI: 10.1080/09296174.2017.1406294
M. Oakes
Abstract: The Indus Script originates from the culture known as the Indus Valley Civilization, which flourished from approximately 2600 to 1900 BC. Several thousand objects bearing these signs have been found over a wide area of Northern India and Pakistan. In 1977, Iravatham Mahadevan published a concordance of all of the scripts that had been discovered so far. Accompanying the concordance is a set of nine tables showing the distribution of individual signs by position, archaeological site, object type, field symbol (accompanying image), and direction of writing. Analysis of the frequencies of the signs found so far using Large Numbers of Rare Events (LNRE) models estimated the total vocabulary of the language, including signs not yet found, to be about 857. All the tables were analysed using Pearson’s residuals, and it was found that the signs were not randomly distributed, but some showed statistically significant associations with position, object, field symbol or direction of writing. A more detailed analysis of the relation between signs and field symbols was made using correspondence analysis, which showed that certain signs were associated with the unicorn symbol, while others were associated with the gharial and dotted circle symbols.
Citations: 0
On the ‘Stickiness’ of Words. A Comparative Language Study Screening the Internet for English, German, French and Latin Phrases
IF 1.4, Zone 2, Literature
Journal of Quantitative Linguistics Pub Date: 2019-01-02 DOI: 10.1080/09296174.2018.1451206
M. Berger
Abstract: Language, one of the defining attributes of Homo sapiens, not only deploys as a chain of words. Rather, words group together in a non-random way to form phrases. Here, the world-wide web was searched for idiomatic expressions in three living and one extinct language: 1102 English, 1183 German, 1138 French and 1128 Latin phrases distributed into three categories, with high, middle and low frequencies. High-frequency phrases such as "in addition to" and "as a matter of fact" constituted 49.5% of all English phrases, but only 9.0% of the French and 2.5% of the German ones. The middle-frequency category with classical idioms such as "a bitter pill" or "carved in stone" comprised 34.9% of the English, 33.0% of the French, and 24.9% of the German phrases. Most French and German phrases were of low frequency. Latin phrases were found as often as French and more often than German ones in the world-wide web, and exhibited a frequency distribution similar to those of French and German. Frequency distributions yielded three main categories around similar maxima for all four languages, with differing relative proportions. The internet may prove useful for the quantitative comparison of languages.
Citations: 0
Levels of Statistical Use in Applied Linguistics Research Articles: From 1986 to 2015
IF 1.4, Zone 2, Literature
Journal of Quantitative Linguistics Pub Date: 2019-01-02 DOI: 10.1080/09296174.2017.1421498
Reza Khany, Khalil Tazik
Abstract: The main objective of this study is to assess the levels of statistical use (basic, intermediate, and advanced) in Applied Linguistics research articles over the past three decades (from 1986 to 2015). The corpus included 4079 quantitative and mixed-methods studies published in ten prominent journals of Applied Linguistics. The articles were analysed and the statistical techniques used were aggregated by two current writers and four PhD students in TEFL. Results showed that descriptive statistics (40.04%) were by far the most commonly used technique, followed by one-way ANOVA (14.91%), t-test (10.15%), and Pearson correlation (8.76%). Regarding the sophistication level of statistical use, about 78.77% (n = 4686) of the techniques were classified as basic, 14.49% (n = 862) as intermediate, and 6.74% (n = 401) as advanced. Clearly, most of the techniques were either basic or intermediate, with a significantly higher percentage for the former. So, a person with basic knowledge of statistics could understand 69.03% of the papers published during 1986 to 2015. It is discussed that researchers should be updated on recent statistical knowledge if they wish to statistically comprehend research articles published in Applied Linguistics journals.
Citations: 18
Menzerath-Altmann Law and Prothetic /v/ in Spoken Czech
IF 1.4, Zone 2, Literature
Journal of Quantitative Linguistics Pub Date: 2019-01-02 DOI: 10.1080/09296174.2018.1424493
Ján Mačutek, J. Chromý, M. Koščová
Abstract: This paper first discusses the Menzerath-Altmann law in general, then shows that the law is valid in spoken Czech. In particular, the relation between word length (measured in the number of syllables) and the mean syllable length (measured in the number of phonemes) is investigated. In addition, we model the relation between the relative occurrence of prothetic /v/ in words and word stems which, according to the official norms of the Czech language, begin with phoneme /o/, and word length in syllables in these words.
Citations: 8
The Stylometric Impacts of Ageing and Life Events on Identity
IF 1.4, Zone 2, Literature
Journal of Quantitative Linguistics Pub Date: 2019-01-02 DOI: 10.1080/09296174.2017.1405719
D. Kernot, T. Bossomaier, R. Bradbury
Abstract: Using data containing stylometric markers for depression and Alzheimer’s disease, the 45 novels of Iris Murdoch and P.D. James are examined to see if a signature of an individual, their personality, changes over time due to life events and natural ageing. We use variants of the critical slowing down 1-lag autocorrelation and coefficient of skewness techniques with a multivariate identity measure, RPAS, to visualize these changes. We find that life events such as depression, anxiety, and Alzheimer’s disease might be identified outside of natural ageing through a tipping point phenomenon. We believe these techniques might be a useful self-help tool to aid in the signalling of depressive episodes, such as averting suicide, and the early identification of Alzheimer’s disease, or for law enforcement personnel monitoring terrorists on watch lists.
Citations: 10
Is the Menzerath-Altmann Law Specific to Certain Languages in Certain Registers?
IF 1.4, Zone 2, Literature
Journal of Quantitative Linguistics Pub Date: 2018-10-18 DOI: 10.1080/09296174.2018.1532158
Lirong Xu, Lianzhen He
Abstract: Since its formulation, the Menzerath-Altmann law (MAL) has gone through continuing validation and development when applied to different languages or different language units. However, whether the MAL still holds true irrespective of spoken or written register remains a controversial issue. This article endeavours to re-examine the MAL by investigating the correlation between the length of English sentences (measured in the number of clauses) and their constituting clause length (measured in the number of words) in both academic spoken and written registers. It is observed that the MAL is valid in both registers. Further, the fitted parameter values of the MAL can serve as good predictors for register differentiation.
Citations: 9
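Note: for readers unfamiliar with the law named in the entry above, the Menzerath-Altmann law is commonly written in the following form. This is the general textbook formulation, not necessarily the exact variant fitted in the article; the symbols $x$, $y$, $a$, $b$ and $c$ are the conventional ones and are not taken from the abstract.

```latex
% Menzerath-Altmann law, general form:
%   x = length of the construct (e.g. a sentence, measured in clauses)
%   y = mean length of its constituents (e.g. clauses, measured in words)
%   a, b, c = parameters estimated from data
y(x) = a \, x^{b} \, e^{-c x}

% A frequently used truncated form sets c = 0:
y(x) = a \, x^{b}
```

With a negative $b$ in the truncated form, longer constructs are predicted to have shorter constituents on average, which is the tendency the law describes.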
Dynamic Lexical Features of PhD Theses across Disciplines: A Text Mining Approach
IF 1.4, Zone 2, Literature
Journal of Quantitative Linguistics Pub Date: 2018-10-15 DOI: 10.1080/09296174.2018.1531618
Wei Xiao, S. Sun
Abstract: This study employed a text mining method to investigate the lexical features and their dynamic changes of PhD theses across the natural sciences, social sciences and humanities. Four quantitative indices, i.e. TTR, h-point, R1 and writer’s view, were employed to analyze 150 PhD theses (50 theses from each discipline). Although h-point and writer’s view were found counter-intuitively to show insignificant variation across disciplines, the results of TTR and R1 did reveal sharp contrasts between theses in humanities and natural sciences. While the second half of humanities theses showed a significantly higher level of lexical diversity, indicated by higher TTR, theses in natural sciences tended to be richer in content words in the first half, indicated by a higher R1. Meanwhile, theses in social sciences seemed to be more moderate, with features lying in the middle position. This study has implications not only for the widening of applications of quantitative linguistic methods but also for academic writing (especially PhD thesis writing) instruction and practice.
Citations: 14
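Note: the TTR index mentioned in the entry above is the type-token ratio, i.e. the number of distinct word forms divided by the total number of tokens. The following is a minimal sketch of comparing the two halves of a text, assuming naive lowercase whitespace tokenization; the function names and the sample sentence are hypothetical, and the article's actual preprocessing is not described in the abstract.

```python
from typing import Sequence, Tuple


def type_token_ratio(tokens: Sequence[str]) -> float:
    """Return distinct word types divided by total tokens (0.0 for empty input)."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def ttr_by_half(text: str) -> Tuple[float, float]:
    """Split a text into two halves by token count and return the TTR of each."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokens = text.lower().split()
    mid = len(tokens) // 2
    return type_token_ratio(tokens[:mid]), type_token_ratio(tokens[mid:])


# Hypothetical sample text, used only to show the call pattern.
sample = "the cat sat on the mat the dog ran to the park"
first_half, second_half = ttr_by_half(sample)
print(f"TTR first half: {first_half:.2f}, second half: {second_half:.2f}")
```

Raw TTR falls as texts get longer, so a comparison like this is only meaningful between segments of similar length, which is why the sketch splits by token count.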