{"title":"The rise of colligations","authors":"Olav Hackstein, Ryan Sandell","doi":"10.1075/ijcl.20022.hac","DOIUrl":"https://doi.org/10.1075/ijcl.20022.hac","url":null,"abstract":"This article examines the lexically parallel English and German constructions can’t stand somebody/something and jemanden/etwas nicht ausstehen können “not tolerate (someone or something)”, from synchronic, diachronic, and quantitative perspectives. Syntactic and semantic restrictions suggest that the usage of stand and ausstehen in the relevant sense is older than other semantically similar verbs (e.g. English tolerate, German leiden), while quantitative evidence from corpora shows that the can’t stand and nicht ausstehen können constructions are both colligationally stronger than lexical competitors. Evidence from the history of stand indicates that the lexeme stand in the Germanic and other Indo-European languages has a long history of being employed in the relevant sense. The restrictions on usage and the colligational strength of the respective English and German constructions are thus argued to result from the antiquity of the construction and functional competition from other lexemes.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":" ","pages":""},"PeriodicalIF":1.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45803461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Handle it in-house?","authors":"Ben Naismith, Alan Juffs, Na-Rae Han, Daniel Zheng","doi":"10.1075/ijcl.20024.nai","DOIUrl":"https://doi.org/10.1075/ijcl.20024.nai","url":null,"abstract":"Vocabulary lists of high-frequency lexical items are an important resource in language education and a key product of corpus research. However, no single vocabulary list will be useful for every learning context, with the appropriateness of such lists affected by the corpora on which they are based. This paper investigates the impact of corpus selection on one measure of lexical sophistication, Advanced Guiraud, focusing on two frequency lists originating from an in-house learner corpus (PELIC) and a global learner corpus (Cambridge Learner Corpus). This analysis shows that frequency lists derived from both types of learner corpus can effectively serve as the basis for measuring the development of lexical sophistication, regardless of the specific program of the learners. Therefore, publicly available learner corpus frequency lists can be a reliable resource for stakeholders interested in the lexical gains of language learners.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":"90 ","pages":""},"PeriodicalIF":1.0,"publicationDate":"2022-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138518757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Review of Egbert & Baker (2019): Using Corpus Methods to Triangulate Linguistic Analysis","authors":"L. Anthony","doi":"10.1075/ijcl.00048.ant","DOIUrl":"https://doi.org/10.1075/ijcl.00048.ant","url":null,"abstract":"","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":"1 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2022-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41640039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lectal contamination","authors":"Dirk Pijpops","doi":"10.1075/ijcl.20040.pij","DOIUrl":"https://doi.org/10.1075/ijcl.20040.pij","url":null,"abstract":"This paper presents evidence from both corpora and agent-based simulation for the effect of lectal contamination. By doing so, it shows how agent-based simulation can be used as a complementary technique to corpus research in the study of language variation. Lectal contamination is an effect whereby the words that are typical of a language variety more often appear in a morphosyntactic variant typical of that same variety, even among language use from a different variety. This study looks at the Dutch partitive genitive construction, which exhibits variation between a “Netherlandic” variant with -s ending and a “Belgian” variant without -s ending. It is shown that the probability of the Belgian variant without -s increases among more “Belgian” words, in the language use of both Belgians and people from the Netherlands. Meanwhile, an agent-based simulation reveals the crucial theoretical preconditions that lead to this effect.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":" ","pages":""},"PeriodicalIF":1.0,"publicationDate":"2022-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45538571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Review of Carrió-Pastor (2020): Corpus Analysis in Different Genres: Academic Discourse and Learner Corpora","authors":"Shuqiong Wu","doi":"10.1075/ijcl.00049.wu","DOIUrl":"https://doi.org/10.1075/ijcl.00049.wu","url":null,"abstract":"","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":" ","pages":""},"PeriodicalIF":1.0,"publicationDate":"2022-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48933416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Use words, not constructions!","authors":"Thomas Proisl","doi":"10.1075/ijcl.20072.pro","DOIUrl":"https://doi.org/10.1075/ijcl.20072.pro","url":null,"abstract":"The aim of collostructional analysis or, more precisely, simple collexeme analysis, is to quantify the statistical association between a construction c and a lexeme l that occurs in a particular slot of the construction. The analysis is based on 2×2 contingency tables that ought to represent a cross-classification of the units of analysis. So far, the units of analysis have been identified either as all constructions in the corpus or all instances of a class C of constructions to which construction c belongs. In practice, it is often not possible or feasible to identify these constructions. Therefore, the sample size is typically approximated by heuristic estimates. The bottom-right cell of the contingency table is most affected by these approximations. I suggest that the units of analysis be defined on the word level, instead, as the class W of word forms that satisfy the restrictions on the collexeme slot of c.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":" ","pages":""},"PeriodicalIF":1.0,"publicationDate":"2022-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44619049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring the impact of lexical context on word association responses","authors":"P. Thwaites","doi":"10.1075/ijcl.20102.thw","DOIUrl":"https://doi.org/10.1075/ijcl.20102.thw","url":null,"abstract":"In word association tasks, participants respond with the first word that comes to mind on seeing a given cue. These responses are generally assumed to be influenced by a number of factors, including cue semantics, form, and textual distribution. Previous studies exploring the third of these influences have used pairwise association measures, such as mutual information, to evaluate the extent to which textual distributions influence response selection. In the current paper, a different approach is taken. Rather than examining co-occurrences between a cue and its observed responses, this paper explores the possibility that the cue’s holistic collocational environment shapes its associative profile. Regression modelling demonstrates that the predictability of this textual distribution is a significant predictor of variance in the cue’s response profile. Overall, however, the amount of variance explained is small. A subsequent qualitative examination of distributional and associative profiles suggests several semantically based constraints to response generation.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":" ","pages":""},"PeriodicalIF":1.0,"publicationDate":"2022-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49042248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Degrees of non-standardness","authors":"Teodora Vuković, Anastasia Escher, Barbara Sonnenhauser","doi":"10.1075/ijcl.20014.vuk","DOIUrl":"https://doi.org/10.1075/ijcl.20014.vuk","url":null,"abstract":"A corpus-based method for assessing a range of dialect-standard variation is presented for identifying samples exhibiting the highest prevalence of dialect features. This method provides insight into areal and inter-speaker variation and allows the extraction of maximally non-standard manifestations of the dialect, which may then be sampled and used for the study of language change and variation. The focus is on a non-standard Torlak variety, which has undergone considerable change under the influence of standard Serbian. The degree of variation is assessed by measuring the frequencies of five distinguishing linguistic features: accent position, dative reflexive si, auxiliary omission in the compound perfect, the post-positive article, and analytic case marking in the indirect object and possessive. Locations subject to the greatest and least influence of the standard are revealed using hierarchical clustering. A positive correlation between the frequencies of occurrence reveals which non-standard feature is the best predictor of the others.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":" ","pages":""},"PeriodicalIF":1.0,"publicationDate":"2022-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45507128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multi-dimensional comparison of the effectiveness and efficiency of association measures in collocation extraction","authors":"Yaochen Deng, Dilin Liu","doi":"10.1075/ijcl.19111.den","DOIUrl":"https://doi.org/10.1075/ijcl.19111.den","url":null,"abstract":"Because of the ubiquity and importance of collocations in language use/learning, how to effectively and efficiently identify collocations has been a topic of interest. Although some studies have evaluated many of the existing association measures (AMs) used in the automatic identification of collocations, the results so far have been inconsistent and unclear due to various limitations of the existing studies. Hence, this study makes a multi-dimensional evaluation of the effectiveness and efficiency of seven major AMs in the identification of three types of collocations across five genres and seven corpora of different sizes. The results indicate that while a few AMs, such as Log Likelihood Ratio and Cubic Mutual Information (MI3), are consistently more effective and efficient than the other five AMs being examined, no one AM alone may be adequate in the identification of different types of collocations across different genres and corpus sizes. Research implications are also discussed.","PeriodicalId":46843,"journal":{"name":"International Journal of Corpus Linguistics","volume":" ","pages":""},"PeriodicalIF":1.0,"publicationDate":"2022-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44397793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}