{"title":"Derivational Suffix Productivity in Persian: A Fuzzy Analysis","authors":"S. Z. Aftabi, A. Ahangar, H. M. Nehi","doi":"10.1080/09296174.2021.1887575","DOIUrl":"https://doi.org/10.1080/09296174.2021.1887575","url":null,"abstract":"ABSTRACT The main aim of this article is to introduce a new way of dealing with the vague concept of suffix productivity in Persian. This approach, that is fuzzy set theory, gives each suffix a degree of membership from [0,1] to different productivity categories. To estimate morphological productivity of Persian suffixes, first Baayen’s proposed measures, i.e. realized productivity, expanding productivity and potential productivity were applied to Bijankhan corpus. Correspondingly, 2.6 million words in the corpus were investigated and analysed using MATLAB and Microsoft Excel software. In the next step, the results of the three productivity measures were illustrated on separate fuzzy diagrams. The findings showed that the three measures employed could give a broader view of different aspects of derivational suffix productivity in Persian. Using fuzzy set theory makes it possible for a given suffix to belong simultaneously to different categories with different degrees of membership. According to the statistics of this research, suffixes – i and – e in Persian had the highest degrees of membership among the most productive suffixes up to now. 
Likewise, they continue to contribute the most to the growth rate of the contemporary Persian lexicon.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"29 1","pages":"387 - 411"},"PeriodicalIF":1.4,"publicationDate":"2021-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2021.1887575","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41351501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interactive Heatmaps as an Improved Means of Analysing Complex Socio-dialectal Patterns: German Loans in Silesian","authors":"I. Fekete, G. Hentschel","doi":"10.1080/09296174.2021.1898089","DOIUrl":"https://doi.org/10.1080/09296174.2021.1898089","url":null,"abstract":"ABSTRACT This paper presents an application of interactive cluster heatmaps in sociolinguistics, a method hitherto scarcely employed in the field. To that end, we developed a statistical workflow to illustrate the method and analyse large-scale Silesian questionnaire data. In our quantitative-linguistic study we demonstrate how heatmaps can uncover information about complex patterns of regional variation, thereby highlighting the added value of the method relative to standard statistical procedures. Specifically, we show (i) how differences in language use between two regions can be determined by deploying the heatmap method but not with traditional significance/hypothesis testing statistical procedures or summary statistics, (ii) and how differences in cohesion and tightness in clusters can be examined via heatmaps but not using cluster analysis. We conclude that heatmaps are a valuable tool for assessing why and how certain word-items group together because of the regional distribution of their usage. A major advantage of the heatmap method is that it can handle two dimensions with hundreds of instantiations and illustrate their interrelations, which would pose problems for traditional statistical techniques. 
Heatmaps provide a novel and accessible way of exploring large-scale sociolinguistic data in their entirety and of generating further hypotheses.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"29 1","pages":"425 - 449"},"PeriodicalIF":1.4,"publicationDate":"2021-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2021.1898089","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43007009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating Phonetic Probability in Etymology","authors":"Kamil Stachowski","doi":"10.1080/09296174.2021.1877458","DOIUrl":"https://doi.org/10.1080/09296174.2021.1877458","url":null,"abstract":"ABSTRACT An etymological proposition is often said to be probable or improbable from the phonetic point of view, and it is not rare for opinions to diverge on which it is. The estimation is typically purely intuitive, based on perceived similarity and no more than a handful of analogous examples. This paper proposes a method for quantifying the phonetic probability of an etymology and comparing it to the alternative hypothesis. It is intended to be used with sizable datasets, to produce a well-supported, objective verdict.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"29 1","pages":"339 - 349"},"PeriodicalIF":1.4,"publicationDate":"2021-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2021.1877458","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43003497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantitative Analysis of Spoken Discourse Using Memoirs of Old-time Moviegoers","authors":"Sevil Çaliskan, F. Can, Hasan Akbulut, S. R. Öztürk","doi":"10.1080/09296174.2021.1873574","DOIUrl":"https://doi.org/10.1080/09296174.2021.1873574","url":null,"abstract":"ABSTRACT We present the first quantitative analysis of spoken discourse for the Turkish language using memoirs of a group of old-time moviegoers of varying age groups whose birth year spreads over a period of four decades ranging from the 1930s to the 1960s. They tell their experiences by answering a set of questions. Their responses are evaluated comprehensively with the expectation that various attributes of the participants are reflected by their everyday speaking language. We also investigate their language characteristics in terms of their vocabulary richness and word usage. The results show that the age and gender of the participants can be inferred to some extent from their speech, as is the case for written text. However, the difference is not significant in the language use of younger and older respondents in terms of vocabulary richness and archaic word usage. 
With additional data obtained for some participants, it is shown that text can be accurately identified as being either spoken or written; however, the spoken text of a person can only be differentiated from their written text with the accuracy level of a random guess.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"29 1","pages":"283 - 313"},"PeriodicalIF":1.4,"publicationDate":"2021-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2021.1873574","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42600496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Why Do Parameter Values in the Zipf-Mandelbrot Distribution Sometimes Explode?","authors":"Ján Mačutek","doi":"10.1080/09296174.2021.1887613","DOIUrl":"https://doi.org/10.1080/09296174.2021.1887613","url":null,"abstract":"ABSTRACT The Zipf-Mandelbrot distribution serves as a mathematical model for ranked frequencies in many areas of scientific research, including linguistics. Many linguistic units, like e.g., words or word n-grams, follow this distribution. However, in some cases, such as for graphemes in linguistics or species abundance and diversity data in biology, the parameters of the Zipf-Mandelbrot distribution are virtually uninterpretable, as their values strongly depend on the precision of numerical methods used to estimate them (values from several tens to several hundreds are not uncommon). It is shown in the paper that these values can be explained by the convergence to the geometric distribution, which forces both parameters of the Zipf-Mandelbrot distribution to increase to infinity while their ratio converges to a constant. Some examples which illustrate this limit behaviour are presented.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"29 1","pages":"413 - 424"},"PeriodicalIF":1.4,"publicationDate":"2021-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2021.1887613","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47499376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Experiments in Text Classification: Analyzing the Sentiment of Electronic Product Reviews in Greek","authors":"Dimitris Bilianos","doi":"10.1080/09296174.2021.1885872","DOIUrl":"https://doi.org/10.1080/09296174.2021.1885872","url":null,"abstract":"ABSTRACT Sentiment analysis, which deals with people’s sentiments as they appear in the growing amount of online social data, has been on the rise in the past few years. In its simplest form, sentiment analysis deals with the polarity of a given text, i.e., whether the opinion expressed in it is positive or negative. Sentiment analysis, or opinion mining applications on websites and the social media range from product reviews and brand reception to political issues and the stock market. The vast majority of the research in sentiment analysis has mostly dealt with English data, where there’s an abundance of readily available and annotated for sentiment corpora. With a few notable exceptions, the research in other minor languages such as Greek is lacking. This paper deals with sentiment analysis of electronic product reviews written in Greek. To this end, a small dataset of 480 positive and negative reviews is compiled and used, taken from the popular Greek e-commerce website, www.skroutz.gr. Different computational models for training and testing the dataset are evaluated, ranging from simple Naive Bayes with n-gram features to state-of-the-art BERT. 
The results look very promising for such a small corpus.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"29 1","pages":"374 - 386"},"PeriodicalIF":1.4,"publicationDate":"2021-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2021.1885872","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43507187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diachronic Distribution of Elemental Ordering in English","authors":"Jiangping Zhou, Yanmei Gao","doi":"10.1080/09296174.2021.1880201","DOIUrl":"https://doi.org/10.1080/09296174.2021.1880201","url":null,"abstract":"ABSTRACT English elemental ordering in a non-canonical word order incorporates preposing, postposing and elemental reversal. This paper intends to explore how these types of elemental ordering are distributed during the last two centuries by employing the Corpus of Historical American English or COHA. The findings demonstrate that preposing has been increasing apparently but still in its inceptive phase; postposing, subsuming existential there and presentational there, generally keeps plateauing with existential there dominating presentational there; and elemental reversal experiences a trend of gradual decreasing, which is attributed to the dominance in occurrences of passive with a by phrase construction over those of inversion. This research provides us with a corpus linguistic insight of exploring elemental ordering, which distinguishes itself from other researches focusing on information status.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"29 1","pages":"350 - 373"},"PeriodicalIF":1.4,"publicationDate":"2021-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2021.1880201","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44673435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Book Review of Corpus Stylistics: Theory and Practice","authors":"Q. Jiang, Yaqin Wang","doi":"10.1080/09296174.2020.1866806","DOIUrl":"https://doi.org/10.1080/09296174.2020.1866806","url":null,"abstract":"","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"28 1","pages":"282 - 287"},"PeriodicalIF":1.4,"publicationDate":"2021-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2020.1866806","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43821008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Revisiting Keyword Analysis in a Specialized Corpus: Religious Terminology Extraction","authors":"Hsin-Yi Lien","doi":"10.1080/09296174.2020.1865668","DOIUrl":"https://doi.org/10.1080/09296174.2020.1865668","url":null,"abstract":"ABSTRACT This study investigates keyword extraction using a compiled Buddhist corpus. It sets out the fundamental mode of generation and refinement of keywords with statistical measures and manual screening with specific criteria. The Buddhist Word List contains 1244 keywords with 375 Pali words in Buddhist literacy. We compared the results of applying occurring frequency, log-likelihood (LL), and odds ratio (OR) in keyword analyses, each of which resulted in different keyword rankings. Our results show that statistical measures are useful for the identification of particular keywords in specific fields and OR is more effective in identifying technical terms. We demonstrate that multilevel keyword analysis is more effective at the identification of high-frequency technical words than either of these methods used alone. Multilevel methods are recommended for the creation of future domain-specific vocabulary lists to overcome the inherent flaws of individual analytic methods.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"29 1","pages":"269 - 282"},"PeriodicalIF":1.4,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2020.1865668","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46172250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contrastive Analysis of Discursive Constructions in Terrorist Attack Reports between Chinese and British Newspapers: Case Study of Reports on Beijing and Barcelona Terrorist Attacks","authors":"Hui Qi, F. Ye","doi":"10.1080/09296174.2019.1595901","DOIUrl":"https://doi.org/10.1080/09296174.2019.1595901","url":null,"abstract":"ABSTRACT Chinese and British mainstream English newspaper’s reports on two terrorist attacks, Beijing 10 · 28 event in 2013 and Barcelona 8 · 17 event in 2017 were used as data. Corpus approaches and Critical Discourse Analysis (CDA) were combined to compare the discursive constructions of terrorist attacks between Chinese and British presses. The findings demonstrate that the western media set the double standards when reporting terrorist events in China and Spain. Although with the same standard, China not only seeks to reveal the true nature of Beijing terrorist attack but also weaves a network of ideological allies. The differences in discursive constructions in Beijing terrorist attack exhibit the different ideologies between China and western countries and reveal the predominance of western countries in controlling the discourse power in the world.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"27 1","pages":"361 - 378"},"PeriodicalIF":1.4,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2019.1595901","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43606550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}