{"title":"The red dress is cute: why subjective adjectives are more often predicative","authors":"Lelia Glass","doi":"10.1515/cllt-2024-0044","DOIUrl":"https://doi.org/10.1515/cllt-2024-0044","url":null,"abstract":"Which adjectives tend to occur as attributive (<jats:italic>the cute/red dress</jats:italic>) versus predicative (<jats:italic>the dress is cute/red</jats:italic>) and why? Building on findings from Wiegand et al. (2013. Predicative adjectives: An unsupervised criterion to extract subjective adjectives. In Lucy Vanderwende, Hal Daumé III & Katrin Kirchhoff (eds.), <jats:italic>Proceedings of the 2013 conference of the North American chapter of the Association for Computational Linguistics: Human language technologies (NAACL-HLT)</jats:italic>, 534–539. Atlanta, GA: Association for Computational Linguistics) and Vartiainen (2013. Subjectivity, indefiniteness and semantic change. <jats:italic>English Language and Linguistics</jats:italic> 17(1). 157–179), this paper argues that subjective adjectives such as <jats:italic>cute</jats:italic> tend to be placed in predicative position not just because they often describe discourse-new information, but because this position serves to foreground information that the hearer may disagree with. This claim is supported using data from the Corpus of Contemporary American English (Davies, Mark. 2008. <jats:italic>The corpus of contemporary American English: One billion words, 1990-present</jats:italic>. Available at: <jats:ext-link xmlns:xlink=\"http://www.w3.org/1999/xlink\" ext-link-type=\"uri\" xlink:href=\"https://www.english-corpora.org/coca/\">https://www.english-corpora.org/coca/</jats:ext-link>) combined with human annotations for subjectivity from Scontras et al. (2017. Subjectivity predicts adjective ordering preferences. <jats:italic>Open Mind</jats:italic> 1(1). 53–66) <jats:italic>et seq.</jats:italic>; and data from image captions versus descriptions (for seeing versus low-vision people) from the National Gallery of Art. A production experiment manipulates the discourse context to further show that adjectives tend to be placed in predicative position when they express controversial information. Overall, this paper explores how the lexical semantics of adjectives shapes the pragmatic contexts in which they tend to be used, which in turn shapes the syntax of the sentences using them.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"20 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A corpus-based study on semantic and cognitive features of bei sentences in Mandarin Chinese","authors":"Yonghui Xie, Ruochen Niu, Haitao Liu","doi":"10.1515/cllt-2024-0031","DOIUrl":"https://doi.org/10.1515/cllt-2024-0031","url":null,"abstract":"<jats:italic>Bei</jats:italic> sentences in Mandarin Chinese with SOV word order have attracted extensive interest. However, their semantic features lacked quantitative evidence and their cognitive features received insufficient attention. Therefore, the current study aims to quantitatively investigate the semantic and cognitive features through the analysis of nine annotated factors in a corpus. The results regarding <jats:italic>bei</jats:italic> sentences show that (i) subjects exhibit a tendency to be definite and animate; non-adversative verbs have gained popularity over time, and intransitive verbs are capable of taking objects; (ii) subject relations tend to be long, implying heavy cognitive load, whereas the dependencies governed by subjects are often short, suggesting light cognitive load; and (iii) certain semantic factors significantly impact cognitive factors; for instance, animate subjects tend to govern shorter dependencies. Overall, our study provides empirical support for the semantic features of <jats:italic>bei</jats:italic> sentences and reveals their cognitive features using dependency distance.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"10 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Verb influence on French wh-placement: a parallel corpus study","authors":"Jan Fliessbach, Johanna Rockstroh","doi":"10.1515/cllt-2024-0001","DOIUrl":"https://doi.org/10.1515/cllt-2024-0001","url":null,"abstract":"Our study investigates the effect of French verb lemmata on the preverbal (QV) or postverbal (VQ) positioning of interrogative forms equivalent to English ‘what’ (<jats:italic>que</jats:italic>, <jats:italic>quoi</jats:italic>, and related forms) within a French–Spanish parallel corpus of subtitles. We highlight and illustrate the corpus’s utility for studying less frequent verbs in combination with specific <jats:italic>wh</jats:italic>-forms. Our findings suggest that less frequent French verbs exhibit weaker associations with QV compared to their more frequent counterparts. A post-hoc study using Spanish translations reveals that French verbs correlated with QV often denote observable actions involving directly accessible Q-referents. We hypothesise that queries concerning ‘situationally accessible’ referents are predominantly utilised for non-standard, evaluative, or challenging questions, which are typically QV in French.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"23 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Idiosyncratic entrenchment: tracing change in constructional schematicity with nested random effects","authors":"Svetlana Vetchinnikova","doi":"10.1515/cllt-2023-0092","DOIUrl":"https://doi.org/10.1515/cllt-2023-0092","url":null,"abstract":"Usage-based constructionist approaches see language as an inventory of constructions at different levels of schematicity learned from the input. If so, personal constructicons should vary as a function of usage. Repeated use and chunking/entrenchment of concrete instances should lead to reanalysis of their internal structure and change in the level of schematicity. This paper exploits the reduction probability of <jats:italic>is</jats:italic> in <jats:italic>it is</jats:italic> as a diagnostic of reanalysis in a 1.75-million-word diachronic corpus of a single blogger over 8 years. All instances of <jats:italic>it is/it’s</jats:italic> (n = 10,929) were annotated at the constructional and lexical levels. A multilevel logistic regression model showed significant fixed effects of constructional entropy and construction-to-word association on reduction probability. Importantly, there remained substantial variation across lexical types of constructions in the extent to which they associated or became associated with reduction over time, suggesting idiosyncratic entrenchment and potential reanalysis as a function of usage.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"44 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transfer five ways: applications of multiple distinctive collexeme analysis to the dative alternation in Mandarin Chinese","authors":"Shengyu Liao, Stefan Th. Gries, Stefanie Wulff","doi":"10.1515/cllt-2024-0033","DOIUrl":"https://doi.org/10.1515/cllt-2024-0033","url":null,"abstract":"The dative alternation has been extensively studied in the world’s languages, and the meanings of the verbs participating in the alternation have been shown to play a key role in determining its argument realization options. The present paper presents a multiple distinctive collexeme analysis approach to the dative alternation in Mandarin Chinese, which involves a choice of one of five functionally similar alternants, and it does so by also discussing several ways to improve how this has been done statistically in most previous analyses. Linguistically, we identify the core semantic differences of the five constructions based on which verbs statistically prefer to occur in which pattern, focusing on semantic potential and direction of transfer. Methodologically, this study contributes to the slowly growing body of studies that use collexeme strengths that are not only less related to frequency than the traditional methods (i.e., association is measured in a less diluted way) and that are directional (i.e., we can focus on one direction of association from the verb to the construction).","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"94 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141153761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Register and the dual nature of functional correspondence: accounting for text-linguistic variation between registers, within registers, and without registers","authors":"Jesse Egbert, Douglas Biber, Daniel Keller, Marianna Gracheva","doi":"10.1515/cllt-2024-0011","DOIUrl":"https://doi.org/10.1515/cllt-2024-0011","url":null,"abstract":"During the past 20 years, corpus linguistic research on register variation has yielded important theoretical advances. The first part of this paper discusses these advances and the cumulative body of research that has produced them. In the second part of the paper, we focus on the goals of research on register variation. The traditional goal of the text-linguistic (TxtLx) approach to linguistic variation has been to describe registers and patterns of register variation: describing the linguistic and situational characteristics of registers. In this paper, we explore a related, but distinct, text-linguistic goal: to account for all linguistic variation among texts. Because the TxtLx framework assumes the importance of <jats:italic>functional correspondence</jats:italic> between linguistic characteristics and situational characteristics, it is reasonable to assume that in addition to register, we can use situational parameters coded continuously at the level of individual texts as additional predictors of text-linguistic variation. We describe the results of an empirical study to show that using both register categories and text-level situational parameters as predictors results in a more comprehensive and explanatory model of text-linguistic variation. In the conclusion we discuss the future of corpus-based register studies, focusing on unanswered questions related to theoretical claims about register.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"38 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141062280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learner corpus research: a critical appraisal and roadmap for contributing (more) to SLA research agendas","authors":"Magali Paquot","doi":"10.1515/cllt-2024-0014","DOIUrl":"https://doi.org/10.1515/cllt-2024-0014","url":null,"abstract":"Over the past decade, learner corpora have gained recognition as valuable data sources in Second Language Acquisition (SLA) research. This development can be attributed to significant progress in Learner Corpus Research (LCR). However, there is still substantial work to be done. This article highlights key issues essential for sustaining the relevance of learner corpora in SLA. More particularly, I focus on the need for more diverse types of learner corpora, stress the importance of detailed metadata, and advocate for multifactorial study designs. I then revisit ongoing debates regarding the role of the native speaker in LCR and propose a practical solution to address this thorny issue. Finally, I also readdress the need for improvement in the quantitative methods and statistics, arguing that the importance of robust quantitative analysis cannot be overstated. In conclusion, I envision an ambitious learner corpus compilation project that adheres to the FAIR principles, with the goal of further elevating study quality in LCR.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"129 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140931112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The distributional properties of long nominal compounds in scientific articles: an investigation based on the uniform information density hypothesis","authors":"John Gamboa, Kristina Braun, Juhani Järvikivi, Shanley E. M. Allen","doi":"10.1515/cllt-2023-0028","DOIUrl":"https://doi.org/10.1515/cllt-2023-0028","url":null,"abstract":"Nominal compounds are a structure commonly used in scientific texts. Despite their commonality, very little is known about how they are distributed in scientific articles. Based on the Uniform Information Density hypothesis, which states that speakers communicate information at a constant rate, avoiding peaks and troughs of information transmission, we predict that nominal compounds should cluster toward the end of scientific texts, be preceded by supporting text that facilitates their understanding, and be repeated often after their first use. In this paper, we examine these predictions through a quantitative and a qualitative analysis of a corpus of scientific papers from the fields of Biology, Economics and Linguistics. While our investigation did not reveal definitive findings for the first and third predictions above, it did produce supporting evidence in favor of our second prediction, thus advancing our understanding of NC use and the choices speakers make when transmitting information.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"56 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140609182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Corpus-based discourse analysis: from meta-reflection to accountability","authors":"Monika Bednarek, Martin Schweinberger, Kelvin K. H. Lee","doi":"10.1515/cllt-2023-0104","DOIUrl":"https://doi.org/10.1515/cllt-2023-0104","url":null,"abstract":"Recent years have seen an increase in data and method reflection in corpus-based discourse analysis. In this article, we first take stock of some of the issues arising from such reflection (covering concepts such as triangulation, objectivity/subjectivity, replication, transparency, reflexivity, consistency). We then introduce a new ‘accountability’ framework for use in corpus-based discourse analysis (and perhaps beyond). We conceptualise such accountability as a multi-faceted phenomenon, covering various aspects of the research process. In the second part of this article, we then link this framework to a new cross-institutional initiative – the Australian Text Analytics Platform (ATAP) – which aims to address a small part of the framework, namely the transparency of analyses through Jupyter notebooks. We introduce the Quotation Tool as an example ATAP notebook of particular relevance to corpus-based discourse analysis. We reflect on how this notebook fosters accountability in relation to transparency of analysis and illustrate key applications using a set of different corpora.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"25 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140609022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A collostructional approach to Japanese noun-modifying clause construction use and acquisition: a learner corpus study","authors":"Nicole C. De Los Reyes, Ute Römer-Barron","doi":"10.1515/cllt-2024-0020","DOIUrl":"https://doi.org/10.1515/cllt-2024-0020","url":null,"abstract":"Japanese features a general noun-modifying clause construction (NMCC) with a more versatile range of semantic and pragmatic interpretations than equivalent constructions in other languages. Motivated by the learning challenge NMCCs pose to Japanese as a foreign language (JFL) learners, this article examines speech data from the International Corpus of Japanese as a Second Language (I-JAS) to compare learner use of NMCCs against a large L1 Japanese corpus. Instances of the construction from both corpora were analyzed to identify high-frequency part-of-speech categories and subcategories in the modifying clause predicate and head noun slots. A simple collexeme analysis was then employed to identify strongly attracted and repelled lexical items among those identified in realizations of the construction. Taken together, findings from these analyses revealed an important connection between the semantic weight of head nouns in NMCCs and the idiomaticity of the construction, with learner productions demonstrating a tendency toward heavy head nouns. This study lays the groundwork for future research seeking to explore the NMCC at different levels of granularity and to improve its treatment in JFL pedagogical materials.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"30 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140196917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}