{"title":"Levels of Statistical Use in Applied Linguistics Research Articles: From 1986 to 2015","authors":"Reza Khany, Khalil Tazik","doi":"10.1080/09296174.2017.1421498","DOIUrl":"https://doi.org/10.1080/09296174.2017.1421498","url":null,"abstract":"Abstract The main objective of this study is to assess the levels of statistical use (basic, intermediate, and advanced) in Applied Linguistics research articles over the past three decades (from 1986 to 2015). The corpus included 4079 quantitative and mixed-methods studies published in ten prominent journals of Applied Linguistics. The articles were analysed and the statistical techniques used were aggregated by the two present authors and four PhD students in TEFL. Results showed that descriptive statistics (40.04%) were by far the most commonly used technique, followed by one-way ANOVA (14.91%), t-test (10.15%), and Pearson correlation (8.76%). Regarding the sophistication level of statistical use, about 78.77% (n = 4686) of the techniques were classified as basic, 14.49% (n = 862) as intermediate, and 6.74% (n = 401) as advanced. Clearly, most of the techniques were either basic or intermediate, with a significantly higher percentage for the former. So, a person with basic knowledge of statistics could understand 69.03% of the papers published during 1986 to 2015. It is argued that researchers should keep their statistical knowledge up to date if they wish to understand the statistics in research articles published in Applied Linguistics journals.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"26 1","pages":"48 - 65"},"PeriodicalIF":1.4,"publicationDate":"2019-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2017.1421498","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45244061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the ‘Stickiness’ of Words. A Comparative Language Study Screening the Internet for English, German, French and Latin Phrases","authors":"M. Berger","doi":"10.1080/09296174.2018.1451206","DOIUrl":"https://doi.org/10.1080/09296174.2018.1451206","url":null,"abstract":"Abstract Language, one of the defining attributes of Homo sapiens, not only deploys as a chain of words. Rather, words group together in a non-random way to form phrases. Here, the world-wide web was searched for idiomatic expressions in three living and one extinct language: 1102 English, 1183 German, 1138 French and 1128 Latin phrases distributed into three categories, with high, middle and low frequencies. High-frequency phrases such as ‘in addition to’ and ‘as a matter of fact’ constituted 49.5% of all English phrases, but only 9.0% of the French and 2.5% of the German ones. The middle-frequency category with classical idioms such as ‘a bitter pill’ or ‘carved in stone’ comprised 34.9% of the English, 33.0% of the French, and 24.9% of the German phrases. Most French and German phrases were of low frequency. Latin phrases were found as often as French and more often than German ones in the world-wide web, and exhibited a frequency distribution similar to those of French and German. Frequency distributions yielded three main categories around similar maxima for all four languages, with differing relative proportions. The internet may prove useful for the quantitative comparison of languages.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"26 1","pages":"81 - 94"},"PeriodicalIF":1.4,"publicationDate":"2019-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2018.1451206","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44814266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Menzerath-Altmann Law and Prothetic /v/ in Spoken Czech","authors":"Ján Mačutek, J. Chromý, M. Koščová","doi":"10.1080/09296174.2018.1424493","DOIUrl":"https://doi.org/10.1080/09296174.2018.1424493","url":null,"abstract":"Abstract This paper first discusses the Menzerath-Altmann law in general and then shows that the law is valid in spoken Czech. In particular, the relation between word length (measured in the number of syllables) and the mean syllable length (measured in the number of phonemes) is investigated. In addition, we model the relation between the relative occurrence of prothetic /v/ in words and word stems which, according to the official norms of the Czech language, begin with the phoneme /o/, and word length in syllables in these words.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"26 1","pages":"66 - 80"},"PeriodicalIF":1.4,"publicationDate":"2019-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2018.1424493","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41544845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Stylometric Impacts of Ageing and Life Events on Identity","authors":"D. Kernot, T. Bossomaier, R. Bradbury","doi":"10.1080/09296174.2017.1405719","DOIUrl":"https://doi.org/10.1080/09296174.2017.1405719","url":null,"abstract":"Abstract Using data containing stylometric markers for depression and Alzheimer’s disease, the 45 novels of Iris Murdoch and P.D. James are examined to see if a signature of an individual, their personality, changes over time due to life events and natural ageing. We use variants of the critical slowing down 1-lag autocorrelation and coefficient of skewness techniques with a multivariate identity measure, RPAS, to visualize these changes. We find that life events such as depression, anxiety, and Alzheimer’s disease might be identified outside of natural ageing through a tipping point phenomenon. We believe these techniques might be a useful self-help tool to aid in the signalling of depressive episodes, such as averting suicide, and the early identification of Alzheimer’s disease, or for law enforcement personnel monitoring terrorists on watch lists.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"26 1","pages":"1 - 21"},"PeriodicalIF":1.4,"publicationDate":"2019-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2017.1405719","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46875671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Is the Menzerath-Altmann Law Specific to Certain Languages in Certain Registers?","authors":"Lirong Xu, Lianzhen He","doi":"10.1080/09296174.2018.1532158","DOIUrl":"https://doi.org/10.1080/09296174.2018.1532158","url":null,"abstract":"ABSTRACT Since its formulation, the Menzerath-Altmann law (MAL) has gone through continuing validation and development when applied to different languages or different language units. However, whether the MAL still holds true irrespective of spoken or written register remains a controversial issue. This article endeavours to re-examine the MAL by investigating the correlation between the length of English sentences (measured in the number of clauses) and their constituting clause length (measured in the number of words) in both academic spoken and written registers. It is observed that the MAL is valid in both registers. Further, the fitted parameter values of the MAL can serve as good predictors for register differentiation.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"27 1","pages":"187 - 203"},"PeriodicalIF":1.4,"publicationDate":"2018-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2018.1532158","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45103238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Lexical Features of PhD Theses across Disciplines: A Text Mining Approach","authors":"Wei Xiao, S. Sun","doi":"10.1080/09296174.2018.1531618","DOIUrl":"https://doi.org/10.1080/09296174.2018.1531618","url":null,"abstract":"ABSTRACT This study employed a text mining method to investigate the lexical features and their dynamic changes of PhD theses across the natural sciences, social sciences and humanities. Four quantitative indices, i.e. TTR, h-point, R1 and writer’s view, were employed to analyze 150 PhD theses (50 theses from each discipline). Although h-point and writer’s view were found counter-intuitively to show insignificant variation across disciplines, the results of TTR and R1 did reveal sharp contrasts between theses in humanities and natural sciences. While the second half of humanities theses showed a significantly higher level of lexical diversity, indicated by higher TTR, theses in natural sciences tended to be richer in content words in the first half, indicated by a higher R1. Meanwhile, theses in social sciences seemed to be more moderate, with features lying in the middle position. This study has implications not only for the widening of applications of quantitative linguistic methods but also for academic writing (especially PhD thesis writing) instruction and practice.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"27 1","pages":"114 - 133"},"PeriodicalIF":1.4,"publicationDate":"2018-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2018.1531618","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49664793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantitative Analysis of Dependency Structures","authors":"Yalan Wang","doi":"10.1080/09296174.2018.1558835","DOIUrl":"https://doi.org/10.1080/09296174.2018.1558835","url":null,"abstract":"Russkom Jazyke vs Problemy I Rešenija) and an unnecessary mixing of transliteration rules (s, sh), and in the Cataloguing in Publication (CIP) one finds an illegible writing of the Russian names of the editors, which is surely not the fault of the editors. In sum, this volume gives a good overview of the current state of the art in empirical linguistics, relying heavily on the particular subset of statistical methods applied. However, the title of the volume is at least partly misleading, since any links to current studies on Russian and quantitative Russian linguistics are missing. The focus is clearly placed on (statistical) corpus linguistics, wherein – as this volume shows – statistical methods can doubtlessly be applied fruitfully, albeit mainly embedded as an inductive tool with which to obtain general empirical tendencies. Taking into account the rich tradition of Russian quantitative linguistics – represented by outstanding scholars such as R. G. Piotrovskij, M. V. Arapov, J. A. Tuldava and J. K. Krylov, among many others – this volume seems to represent a ‘new’ beginning of the application of quantitative methods of Russian, but by no means of post-Soviet and Russian linguistics in general.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"27 1","pages":"83 - 91"},"PeriodicalIF":1.4,"publicationDate":"2018-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2018.1558835","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48787891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Statistical Explanation of the Distribution of Sortal Classifiers in Languages of the World via Computational Classifiers","authors":"One-Soon Her, Marc Allassonnière-Tang","doi":"10.1080/09296174.2018.1523777","DOIUrl":"https://doi.org/10.1080/09296174.2018.1523777","url":null,"abstract":"ABSTRACT Previous studies demonstrate that morphosyntactic plural markers and the structure of numeral systems have individually strong predictive power with regard to the usage of sortal classifiers in languages. We use these two factors as explanatory variables to train the computational classifier of random forests and evaluate the accuracy of their predictive power when selecting the existence/absence of sortal classifiers as response variable. Our results show that these two factors result in an excellent discrimination performance of random forests, even when taking into account sortal classifiers as an areal feature. However, the correlation between morphosyntactic plural markers and multiplicative bases is weaker than the correlation between sortal classifiers and plural markers plus multiplicative bases. We are thus able to provide novel insights with regard to probabilistic universals on sortal classifiers, and suggest an innovative cross-disciplinary approach to test the effect of implicational universals with computational methods.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"27 1","pages":"93 - 113"},"PeriodicalIF":1.4,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2018.1523777","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43034123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Readability Analysis of Bengali Literary Texts","authors":"Shanta Phani, S. Lahiri, A. Biswas","doi":"10.1080/09296174.2018.1499456","DOIUrl":"https://doi.org/10.1080/09296174.2018.1499456","url":null,"abstract":"ABSTRACT In this paper we propose a set of novel regression models for readability scoring in Bengali language, which can also be used for Hindi, making use of several lexical, surface-level, syntactic and semantic features. We perform 5-fold and leave-one-out cross-validation on a human-annotated gold standard dataset of 30 passages, written by 4 eminent Bengali litterateurs. On this dataset, our best model achieves a mean squared error (MSE) of 57%, which is better than state-of-the-art results (73% MSE). We further perform feature analysis to identify potentially useful features in learning a regression model for Bengali readability. Ablation studies indicate the importance of compound characters (Juktakkhors) in readability assessment.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"26 1","pages":"287 - 305"},"PeriodicalIF":1.4,"publicationDate":"2018-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2018.1499456","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42933734","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantitative aspects of the clause: Length, position and depth of the clause","authors":"Haruko Sanada","doi":"10.1080/09296174.2018.1491749","DOIUrl":"https://doi.org/10.1080/09296174.2018.1491749","url":null,"abstract":"ABSTRACT The present study focuses on the quantitative aspects of clauses related to the empirical study on valency. We employed the length of clause, the position of clause, and the depth of clause as the linguistic entities. It can be observed that there are relationships with significant functions among these entities. A relationship between the position and the depth of clause obeys Köhler’s model while a relationship between the length and the position of the clause shows opposite functions to Köhler’s model. A relationship between the depth and the length of the clause shows a decreasing function. However, length can be affected by other entities. The method of measuring entities, e.g. a position of the clause in the sentence must be reconsidered.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"26 1","pages":"306 - 329"},"PeriodicalIF":1.4,"publicationDate":"2018-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/09296174.2018.1491749","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43648888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}