{"title":"Latent-Variable Modelling of Ordinal Outcomes in Language Data Analysis","authors":"Lukas Sönning, Manfred Krug, Fabian Vetter, Timo Schmid, Anne Leucht, Paul Messer","doi":"10.1080/09296174.2024.2329448","DOIUrl":"https://doi.org/10.1080/09296174.2024.2329448","url":null,"abstract":"In empirical work, ordinal variables are typically analysed using means based on numeric scores assigned to categories. While this strategy has met with justified criticism in the methodological li...","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"63 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140583169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Corrections to Nelson (2023): DPnorm and DKLnorm are Not Wrong on Pi at All","authors":"Stefan Th Gries","doi":"10.1080/09296174.2024.2324616","DOIUrl":"https://doi.org/10.1080/09296174.2024.2324616","url":null,"abstract":"This paper mainly discusses two computational errors in Nelson (2023), which demonstrate that part of his conclusions regarding two dispersion measures are flawed.","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"52 11 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140297981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multifractal Analysis of the Distribution of Three Grammatical Constructions in English Texts","authors":"Rosmawati, Wander Lowie","doi":"10.1080/09296174.2024.2302674","DOIUrl":"https://doi.org/10.1080/09296174.2024.2302674","url":null,"abstract":"Both the Menzerath-Altmann law and the Zipf-Mandelbrot law note that language is a fractal structure and, like any other fractals, follows power laws. Studies on fractal linguistics demonstrated th...","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"113 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139752644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantitative Approaches to Universality and Individuality in Language","authors":"Wei Huang, Tenghao Ji","doi":"10.1080/09296174.2023.2294786","DOIUrl":"https://doi.org/10.1080/09296174.2023.2294786","url":null,"abstract":"Published in Journal of Quantitative Linguistics (Ahead of Print, 2023)","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"98 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2023-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138826658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Current State and Prominent Features of Quantitative Linguistics Through the Lens of QUALICO 2023: A Conference Report","authors":"Jianwei Yan","doi":"10.1080/09296174.2023.2283932","DOIUrl":"https://doi.org/10.1080/09296174.2023.2283932","url":null,"abstract":"Quantitative Linguistics (QL) is an academic field that employs quantitative and statistical methods to explore language patterns and linguistic laws. From June 28th to 30th, 2023, the Internationa...","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"51 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138529561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Text Segmentation Via Processes that Count the Number of Different Words Forward and Backward","authors":"Berhane Abebe, Mikhail Chebunin, Artyom Kovalevskii","doi":"10.1080/09296174.2023.2275342","DOIUrl":"https://doi.org/10.1080/09296174.2023.2275342","url":null,"abstract":"The paper develops a new statistical approach to automatically partitioning texts into parts belonging to different authors. It is based on the analysis of processes that count the number of different words forward and backward. The theoretical study of the processes is based on the assumptions of an elementary probability model with a change point. We prove consistency of our statistical estimate of the point of concatenation in the case when the concatenated texts have different Zipf exponents. The method is tested on the Brown corpus and also on newspaper texts in different languages. Testing shows a good estimate of the concatenation point. This method can be used in parallel with other text segmentation methods. Acknowledgments: The authors would like to thank the anonymous referees for their helpful and constructive comments and suggestions. Disclosure statement: No potential conflict of interest was reported by the author(s). Data availability statement: We used texts from open sources. Funding: The work was supported by the Siberian Branch, Russian Academy of Sciences [FWNF-2022-0010].","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"79 19","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135037122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Word Length in Chinese: The Menzerath-Altmann Law is Valid After All","authors":"Tereza Motalová, Ján Mačutek, Radek Čech","doi":"10.1080/09296174.2023.2259937","DOIUrl":"https://doi.org/10.1080/09296174.2023.2259937","url":null,"abstract":"According to the Menzerath-Altmann law, longer language constructs consist, on average, of shorter constituents. It is most often studied at the level of words and syllables (the mean syllable length gets shorter with increasing word length). Its validity at this level has been corroborated in several languages. However, it has been claimed that Chinese is an exception with respect to the validity of the Menzerath-Altmann law. We show that the law is valid if word types are considered, while the behaviour of word tokens is different. This difference can be explained by the fact that the Zipf law of abbreviation is valid not only for words but also for syllables (shorter syllables are used more frequently). Keywords: word length; Menzerath-Altmann law; Chinese; syllable; Chinese characters. Acknowledgments: The work was supported by the European Regional Development Fund Project “Sinophone Borderlands – Interaction at the Edges”, CZ.02.1.01/0.0/0.0/16_019/0000791 (T. Motalová), VEGA 2/0096/21 (J. Mačutek), APVV-21-0216 (J. Mačutek), and Operational Programme Integrated Infrastructure (OPII) for the project 313011BWH2: “InoCHF – Research and development in the field of innovative technologies in the management of patients with CHF”, co-financed by the European Regional Development Fund (J. Mačutek). Disclosure statement: No potential conflict of interest was reported by the author(s). Notes: 1. A more general formula with an additional parameter c, $y(x) = a x^{b} e^{cx}$, is sometimes used, see e.g. Mačutek et al. (2019). 2. The MAL has also found its place in research areas outside of human language, such as music (Boroda & Altmann, 1991), animal communication (Gustison et al., 2016), and genome structure (Ferrer-I-Cancho et al., 2014). The ‘common denominator’ of these branches of science is that they study information flow (in a very general sense). 3. Syllable length was measured in moras, not in phonemes. 4. In some of the papers cited in this paragraph, the mean syllable length is expressed in the number of graphemes rather than phonemes. The mean syllable length is quite similar for both choices in languages with shallow orthographies (Coulmas, 2002). 5. Erization is the addition of the r-suffix (儿) to a syllable, e.g. 花 huā becomes 花儿 huār (‘flower’). Moreover, there are a few singular exceptions of polysyllabic characters in Chinese. Qiu (2000, pp. 26, 406) mentions 瓩 qiānwǎ ‘kilowatt’, 浬 hǎilǐ ‘nautical mile’, and 哩 yīnglǐ ‘English mile’ (none of these words occurs in our language material). 6. Xin Han-Da cidian – Das neue Chinesisch-Deutsche Wörterbuch, 1985. Commercial Press, Beijing. 7. In fact, one can speak about phonological words here, see e.g. Hall (1999) or Zsiga (2013, pp. 342–346). Thus, this approach can be considered a study of the MAL on the level of words, albeit from a slightly different perspective. 8. Lengths of stress units ranged between 1 and 18 syllables while in the case of rhythmic segm","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"6 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135584918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Structural Factor Analysis of Lexical Complexity Constructs and Measures: A Quantitative Measure-Testing Process on Specialised Academic Texts","authors":"Maryam Nasseri, Philip McCarthy","doi":"10.1080/09296174.2023.2258782","DOIUrl":"https://doi.org/10.1080/09296174.2023.2258782","url":null,"abstract":"This study evaluates 22 lexical complexity measures that represent the three constructs of density, diversity and sophistication. The selection of these measures stems from an extensive review of the SLA linguistics literature. All measures were subjected to qualitative screening for indicators/predictors of lexical proficiency/development and criterion validity based on the body of scholarship. This study’s measure-testing process begins by dividing the selected measures into two groups, similarly calculated and dissimilarly calculated, based on their quantification methods and the results of correlation tests. Using a specialized corpus of postgraduate academic texts, a Structural Factor Analysis (SFA) comprising a Confirmatory Factor Analysis (CFA) and Exploratory Factor Analysis (EFA) is then conducted. The purpose of SFA is to 1) verify and examine the lexical classifications proposed in the literature, 2) evaluate the relationship between various lexical constructs and their representative measures, 3) identify the indices that best represent each construct and 4) detect possible new structures/dimensions. Based on the analysis of the corpus, the study discusses the construct-distinctiveness of lexical complexity constructs, as well as strong indicators of each conceptual/mathematical group among the measures. Finally, a unique and smaller set of measures representative of each construct is suggested for future studies that require measure selection. Acknowledgments: We would like to thank the two anonymous reviewers for their valuable suggestions and comments. Disclosure statement: No potential conflict of interest was reported by the author(s). Credit authorship contribution statement: Maryam Nasseri: Conceptualization, Data curation, Methodology, Data analysis and evaluation of findings, Project administration, Visualization, Writing: original draft, Writing: critical review & editing, Funding acquisition. Philip McCarthy: Measure-selection, Writing: critical review & editing, Funding acquisition. Notes: 1. The lexical sophistication measures in LCA-AW are filtered through the BAWE (British Academic Written English) corpus and its most-frequently-used academic writing words used in linguistics and language studies as well as the general English frequency word lists based on the BNC (the British National Corpus) or ANC (American National Corpus). 2. LCA-AW and TAALED calculate the indices based on lemma forms while Coh-Metrix calculates the vocd-D index based on word forms. In the latter case, lemmatized files can be used as the input to Coh-Metrix. 3. The R packages used in this study include psych (version 1.8.12, Revelle, 2018), lavaan (version 0.5–18, Rosseel, 2012) and corrplot (version 0.84, Wei & Simko, 2017). Funding: This study is part of the “Lexical Proficiency Grading for Academic Writing (FRG23-C-S66)” comprehensive research granted by the American University of Sharjah (AUS). Notes on cont","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"24 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135935652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Words and Numbers. In Memory of Peter Grzybek (1957-2019), edited by Emmerich Kelih and Reinhard Köhler, Lüdenscheid, RAM-Verlag, 2020, 248 pp., ISBN 978-3-942303-89-7, 55,00 EUR for the paperback version","authors":"Mengge Wang","doi":"10.1080/09296174.2023.2262696","DOIUrl":"https://doi.org/10.1080/09296174.2023.2262696","url":null,"abstract":"","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136102652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lexical Features and Psychological States: A Quantitative Linguistic Approach","authors":"Xiaowei Du","doi":"10.1080/09296174.2023.2256211","DOIUrl":"https://doi.org/10.1080/09296174.2023.2256211","url":null,"abstract":"In recent decades, there has been an increasing interest in the relation between lexical features and texts of psychological states. Previous studies demonstrated that some lexical features varied significantly among the texts of psychological states. However, the lexical features at the textual level have received little attention. This paper extends this work by examining the performance of quantitative linguistic indices in classifying texts of psychological issues. A large dataset of forum posts, including texts of anxiety, depression, suicide ideation, and normal states, was tested with Machine Learning algorithms. The results revealed that the quantitative linguistic indices with Machine Learning algorithms achieved a high level of success in identifying psychological states. Meanwhile, some quantitative linguistic indices, namely, ALT and Writer’s view, may extract adequate lexical features for classifying texts of different psychological states. The study is probably the first attempt to use quantitative linguistic indices as lexical features to detect texts of psychological states, and the findings may contribute to our understanding of how accuracy may be enhanced in the identification of various psychological states. Finally, the implications of these findings are discussed. Publisher’s Note: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher. Acknowledgments: We thank the JQL referees and the editors for their insightful comments. Their suggestions have significantly enhanced the quality of the initial manuscripts. Disclosure Statement: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. Data Availability Statement: Publicly available datasets were analysed in this study. We used AlMosaiwi and Johnstone’s (2018) dataset, which can be accessed at https://doi.org/10.6084/m9.figshare.4743547.v1. Supplemental data: Supplemental data for this article can be accessed online at https://doi.org/10.1080/09296174.2023.2256211. Notes: 1. The dataset can be accessed at https://doi.org/10.6084/m9.figshare.4743547. Funding: This study was supported by “the Fundamental Research Funds for the Central Universities” (Grant No. 3132023331).","PeriodicalId":45514,"journal":{"name":"Journal of Quantitative Linguistics","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135729302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}