{"title":"Present perfect and preterit variation in the Spanish of Lima and Mexico City: findings from a corpus analysis","authors":"Anna Mastrantuono, Brendan Regan","doi":"10.1515/cllt-2022-0060","DOIUrl":"https://doi.org/10.1515/cllt-2022-0060","url":null,"abstract":"Abstract In many languages, the present perfect has grammaticalized, gradually displacing the preterit. Within Spanish, this has been documented with the grammaticalization of the present perfect in Peninsular Spanish. To examine this possibility in two Latin American varieties, this study examined present perfect/preterit variation of 36 speakers from Lima and Mexico City from the PRESEEA corpus. While Lima Spanish presented overall more present perfect than Mexico City Spanish, a similar internal constraint hierarchy is predictive of present perfect use in both speech communities. However, Lima Spanish demonstrated a change in progress toward an expansion of the preterit among younger speakers with the indeterminate temporal reference as locus of change. The findings suggest that present perfect grammaticalization may not always be the most common cross-linguistic pathway but rather is subject to source constraints, which may lead to another pathway in which the preterit expands at the expense of the present perfect.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":" ","pages":""},"PeriodicalIF":1.6,"publicationDate":"2023-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44220873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The linguistic organization of grammatical text complexity: comparing the empirical adequacy of theory-based models","authors":"D. Biber, Tove Larsson, G. Hancock","doi":"10.1515/cllt-2023-0016","DOIUrl":"https://doi.org/10.1515/cllt-2023-0016","url":null,"abstract":"Abstract Although there is a long tradition of research analyzing the grammatical complexity of texts (in both linguistics and applied linguistics), there is surprisingly little consensus on the nature of complexity. Many studies have disregarded syntactic (and structural) distinctions in their analyses of grammatical text complexity, treating it instead as if it were a single unified construct. However, other corpus-based studies indicate that different grammatical complexity features pattern in fundamentally different ways. The present study employs methods that are informed by structural equation modeling to test the goodness-of-fit of four models that can be motivated from previous research and linguistic theory: a model treating all complexity features as a single dimension, a model distinguishing among three major structural types of complexity features, a model distinguishing among three major syntactic functions of complexity features, and a model distinguishing among nine combinations of structural type and syntactic functions. The findings show that text complexity is clearly a multi-dimensional construct. Both structural and syntactic distinctions are important. Syntactic distinctions are actually more important than structural distinctions, although the combination of the two best accounts for the ways in which complexity features pattern in texts from different registers.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":" ","pages":""},"PeriodicalIF":1.6,"publicationDate":"2023-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48874237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The blurring of the boundaries: changes in verb/noun heterosemy in Recent English","authors":"Bin Shao, Jing Zheng, Hendrik De Smet","doi":"10.1515/cllt-2022-0053","DOIUrl":"https://doi.org/10.1515/cllt-2022-0053","url":null,"abstract":"Abstract Conversion is a common feature of present-day English, leading to many ‘heterosemous’ words that express related meanings across multiple word classes. Especially common is verb/noun heterosemy, as in flow or hand, both of which can be used as verbs or as nouns. The prevalence of verb/noun heterosemy sets English apart from closely related Germanic languages and is one respect in which English behaves as a language with high boundary permeability. This paper investigates how verb/noun heterosemy has been evolving in Recent English (1920s–2010s). Using quantitative analysis within a large sample of 877 heterosemous words, it is shown that associations between specific words and word classes have been weakening over the last century. More precisely, within our sample, heterosemous words on average tend to develop towards more balanced heterosemy, whereby their association to either one word class or another becomes less pronounced. The findings suggest that English is in the process of a long-term drift towards greater boundary permeability. As high boundary permeability has been associated with low reliance on inflectional morphology in a language, this could be a long-term consequence of the overall loss of inflections earlier in the history of the language.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":" ","pages":""},"PeriodicalIF":1.6,"publicationDate":"2023-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44874543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Let my speakers talk: metalinguistic activity can indicate semantic change","authors":"Israela Becker","doi":"10.1515/cllt-2023-0022","DOIUrl":"https://doi.org/10.1515/cllt-2023-0022","url":null,"abstract":"Abstract In the absence of a diachronic corpus or a synchronic corpus tagged for speakers’ age, substantiating the presence of semantic change and the stage of change ― initial or advanced ― are challenging tasks. In the present study I introduce three methods for overcoming such difficulties by extracting various kinds of evidence from a synchronic corpus not tagged for speakers’ age. All three methods are based on speakers’ metalinguistic activity. Two of them are of a psycholinguistic nature and the third is of a sociolinguistic nature. Not only do these methods provide data hitherto overlooked by researchers for detecting semantic change, but they can also minimize the researchers’ need for interpretative interventions with regard to speakers’ communicative intentions, thus improving the quality of the analysis.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":" ","pages":""},"PeriodicalIF":1.6,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49002825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multifactorial aspectual analysis of verb concatenation with imperfective markers zhe in Mandarin","authors":"Junjie Jin, F. Li","doi":"10.1515/cllt-2022-0080","DOIUrl":"https://doi.org/10.1515/cllt-2022-0080","url":null,"abstract":"Abstract As a cognitive ability to construe events in alternate ways, aspectuality has aroused many researchers’ academic attention; however, the concatenation of aspect markers in a clause is understudied in previous studies. The present paper follows a bidimensional approach of aspect to conduct a corpus-based aspectual analysis of verb concatenation with imperfective markers zhe (henceforth VCIMs zhe) in Mandarin. Specifically, to construe the cognitive inference mechanism of aspect, a multifactorial analysis of VCIMs zhe by the statistical techniques of multiple correspondence analysis, conditional inference trees and conditional random forests is carried out to explore the prototypical temporal features of verbs in two slots, predict the aspectual meanings of two imperfective markers zhe, and also discuss the conditional importance of factors such as durativity, dynamicity, telicity, boundedness, and slot in identifying the situation types of two verbs or verb phrases in VCIMs zhe. Methodologically, a usage-based multifactorial analysis of VCIMs zhe complements previous introspective studies on aspect marking. Theoretically, a corpus-based aspectual account of VCIMs zhe, one type of complex viewpoint aspects, expands traditional studies on Chinese aspect system, supplies evidence for aspect typology cross-linguistically, and provides reference for second language acquisition of usage patterns of zhe by non-native speakers.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":" ","pages":""},"PeriodicalIF":1.6,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47468703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"To drop or not to drop? Predicting the omission of the infinitival marker in a Swedish future construction","authors":"Aleksandrs Berdicevskis, E. Coussé, Alexander Koplenig, Yvonne Adesam","doi":"10.1515/cllt-2022-0101","DOIUrl":"https://doi.org/10.1515/cllt-2022-0101","url":null,"abstract":"Abstract We investigate the optional omission of the infinitival marker in a Swedish future tense construction. During the last two decades the frequency of omission has been rapidly increasing, and this process has received considerable attention in the literature. We test whether the knowledge which has been accumulated can yield accurate predictions of language variation and change. We extracted all occurrences of the construction from a very large collection of corpora. The dataset was automatically annotated with language-internal predictors which have previously been shown or hypothesized to affect the variation. We trained several models in order to make two kinds of predictions: whether the marker will be omitted in a specific utterance and how large the proportion of omissions will be for a given time period. For most of the approaches we tried, we were not able to achieve a better-than-baseline performance. The only exception was predicting the proportion of omissions using autoregressive integrated moving average models for one-step-ahead forecast, and in this case time was the only predictor that mattered. Our data suggest that most of the language-internal predictors do have some effect on the variation, but the effect is not strong enough to yield reliable predictions.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":" ","pages":""},"PeriodicalIF":1.6,"publicationDate":"2023-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42350433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Frontmatter","authors":"","doi":"10.1515/cllt-2023-frontmatter2","DOIUrl":"https://doi.org/10.1515/cllt-2023-frontmatter2","url":null,"abstract":"","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"135 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136272042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of keyness metrics: performance and reliability","authors":"Lukas Sönning","doi":"10.1515/cllt-2022-0116","DOIUrl":"https://doi.org/10.1515/cllt-2022-0116","url":null,"abstract":"Abstract The methodological debates surrounding keyword analysis have given rise to a wide range of keyness metrics. The present paper delineates four dimensions of keyness, which distinguish between frequency- and dispersion-related perspectives. Existing measures are then organized according to these dimensions and evaluated with regard to their performance on a specific keyword analysis task: The identification of key verbs in academic writing. To this end, the rankings produced by 32 different metrics are evaluated against an established academic word list. Further, the reliability of measures is assessed, to determine whether they produce stable rankings across repeated studies on the same pair of text varieties. We observe notable differences among metrics with regard to these criteria. Our findings provide further support for the superiority of the Wilcoxon rank sum test and text-dispersion–based measures, and allow us to identify, within each dimension of keyness, metrics that may be given preference in applied work.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":" ","pages":""},"PeriodicalIF":1.6,"publicationDate":"2023-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43362274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Seeing the wood for the trees: predictive margins for random forests","authors":"Lukas Sönning, Jason Grafmiller","doi":"10.1515/cllt-2022-0083","DOIUrl":"https://doi.org/10.1515/cllt-2022-0083","url":null,"abstract":"Abstract Classification trees and random forests offer a number of attractive features to corpus data analysts. However, the way in which these models are typically reported – a decision tree and/or set of variable importance scores – offers insufficient information if interest centers on the (form of) relationship between (multiple) predictors and the outcome. This paper develops predictive margins as an interpretative approach to ensemble techniques such as random forests. These are model summaries in the form of adjusted predictions, which provide a clearer picture of patterns in the data and allow us to query a model on potential nonlinear associations and interactions among predictor variables. The present paper outlines the general strategy for forming predictive margins and addresses methodological issues from an explicitly (corpus) linguistic perspective. For illustration, we use data on the English genitive alternation and provide an R package and code for their implementation.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"0 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2023-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41334909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A corpus-based quantitative study of numeral classifiers in Nepali","authors":"Krishna Prasad Parajuli, Marc Allassonnière-Tang","doi":"10.1515/cllt-2022-0064","DOIUrl":"https://doi.org/10.1515/cllt-2022-0064","url":null,"abstract":"Abstract Nepali is typologically rare in terms of nominal classification systems, as it is one of the few languages of the world having simultaneously two gender systems (human/non-human, masculine/feminine) and one numeral classifier system (distinguishing features such as human, round-shaped objects, and long objects among others). Such a rare co-occurrence of different nominal classification systems is highly relevant for investigating linguistic complexity, as languages generally do not have several systems of the same type fulfilling the same functions. However, no corpus-based quantitative analyses have been conducted on the productive use of nominal classification systems in Nepali. The current paper aims at filling this gap by providing a token-based study from the Nepali National Corpus (∼20 million words). Our preliminary results show that there is in fact little formal overlap between the classifier and the gender systems.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":" ","pages":""},"PeriodicalIF":1.6,"publicationDate":"2023-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43975397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}