IF 0.5

Corpora Pub Date : 2022-04-01 DOI: 10.3366/cor.2022.0234

P. Jurko

Abstract: This paper presents a corpus-driven Sinclairian analysis of five high-frequency Slovene verbs covering the lexical paradigm ‘to express orally’ in combination with their premodifying adverbs of manner. One of the main goals of the paper is to establish how frequent the phenomenon of semantic prosody actually is among high-frequency lexical items (here, adv-v pairs). A methodology aiming to provide an answer to this question has been proposed featuring the top-down approach (i.e., in order of decreasing frequency of occurrence). It involves setting up the widest possible parameters of searching for so-called ‘extended units of meaning’ and their semantic prosody amongst the most frequent lexical patterns in a language. A total of twenty-six adv-v pairs have been examined. Results indicate a strong correlation between the frequency of multi-word lexical items and their tendency to develop semantic prosodies: high-frequency collocations are thus more likely to have semantic prosodies compared to their lower-frequency counterparts. Overall, results also corroborate the trend of semantic prosody to be found with mainly negative meanings and to a lesser extent in neutral meanings, while no positive semantic prosody has been determined in this study.

Citations: 2

On the status of statistical reporting versus linguistic description in corpus linguistics: a ten-year perspective

IF 0.5

Corpora Pub Date : 2022-04-01 DOI: 10.3366/cor.2022.0238

Tove Larsson, Jesse Egbert, D. Biber

Abstract: This study investigates (i) whether there has been a shift towards increased statistical focus in corpus linguistic research articles, and, if so, (ii) whether this has had any repercussions for the attention paid to linguistic description. We investigate this through an analysis of the relative focus on statistical reporting versus linguistic description in the way the results are reported and discussed in research articles published in four major corpus linguistics journals in 2009 and 2019. The results display a marked change: in 2009, a clear majority of the articles exhibit a preference for linguistic description over statistical reporting; in 2019, the exact opposite is true. The number of different statistical techniques employed has also gone up. Whilst the increased statistical focus may reflect increased methodological sophistication, our results show that it has come at a cost: a diminished focus on linguistic description, evident, for example, through fewer text excerpts and linguistic examples, which appears to be symptomatic of increasing distance from the language that is the object of study. We discuss these shifts and suggest some ways of employing sophisticated statistical techniques without sacrificing a focus on language.

Citations: 8

Review: Pastor and Colson (eds). 2020. Computational Phraseology

IF 0.5

Corpora Pub Date : 2022-04-01 DOI: 10.3366/cor.2022.0239

Joe Geluso

Citations: 0

Review: Egbert and Baker (eds). 2020. Using Corpus Methods to Triangulate Linguistic Analysis

IF 0.5

Corpora Pub Date : 2021-11-01 DOI: 10.3366/cor.2021.0230

Xiaoli Fu

Abstract: Previous research on methodological triangulation, like Baker and Egbert (2016), has mainly focussed on triangulation within corpus linguistics (CL). This timely volume presents triangulation between corpus linguistic methods and other linguistic methodologies through nine empirical studies in discourse analysis, applied linguistics and psycholinguistics. The volume consists of an introduction, nine chapters grouped into three sections, and a ‘Synthesis and Conclusion’. In the Introduction, the editors briefly introduce CL and methodological triangulation. A brief review of previous literature on triangulation between CL and other linguistic methods in the fields of discourse analysis, applied linguistics and psycholinguistics is then presented. It ends with a sequential introduction to the nine studies in the volume. Part I (Chapters 2 to 4) falls into the area of discourse analysis. To analyse text structure in a corpus of twenty-four academic lectures, in Chapter 2, Erin Schnur and Eniko Csomay employ manual/automatic segmentation and qualitative/quantitative analysis. The first approach involves manual segmentation using Mechanical Turk (MT) and qualitative coding of the 1,056 segments identified based on eight functions. The analysis here focusses on the distribution of segment functions in the texts. In the second approach, 769 Vocabulary-Based Discourse Units are automatically identified with TextTiler and then subjected to quantitative analysis, identifying four text-types of segments with similar linguistic features. Thus, the second case study focusses on the distribution of linguistic patterns in text structure to illustrate the association between language variation and pedagogical purpose. In Chapter 3, Tony McEnery, Helen Baker and Carmen Dayrell rely on an historical newspaper corpus to explore the reality of droughts in nineteenth-century Britain. To control the potential errors in the digitised

Citations: 0

Pinning down the gap: gender and the online representation of professional tennis players

IF 0.5

Corpora Pub Date : 2021-11-01 DOI: 10.3366/cor.2021.0227

A. Yip

Abstract: Sport is a powerful social institution where hegemonic masculinity is constantly constructed and naturalised through the positioning of physicality and athleticism alongside maleness. Female athletes continue to be subordinated by means of under-representation and trivialising gender discourses. So far, the extensive discussion of gendered language in sports media has primarily focussed on identifying the manifestations of gender bias in traditional news media. There has been little endeavour to explore the language of online media and tournament organisers. This study addresses that gap by comparing online gender representations of tennis players during the Wimbledon Championships 2018 on five online news websites and the tournament website. It also contributes to existing literature by providing corpus evidence of gender bias in sports media. The corpus consists of 1,622 articles (1,076,475 tokens). Findings from frequency, collocation and concordance analysis indicate that despite some instances of gender-neutral representations, female players are prone to gender marking and gender-bland sexism on all websites. I argue that the challenges women face relate to the tension between femininity and athleticism, and the misguided belief that women need to but can never eliminate the muscle gap.

Citations: 0

Separatism: a cross-linguistic corpus-assisted study of word-meaning development in a time of conflict

IF 0.5

Corpora Pub Date : 2021-11-01 DOI: 10.3366/cor.2021.0228

Tatyana Karpenko-Seccombe

Abstract: This paper considers the role of historical context in initiating shifts in word meaning. The study focusses on two words – the translation equivalents separatist and separatism – in the discourses of Russian and Ukrainian parliamentary debates before and during the Russian–Ukrainian conflict which emerged at the beginning of 2014. The paper employs a cross-linguistic corpus-assisted discourse analysis to investigate the way wider socio-political context affects word usage and meaning. To allow a comparison of discourses around separatism between two parliaments, four corpora were compiled covering the debates in both parliaments before and during the conflict. Keywords, collocations and n-grams were studied and compared, and this was followed by qualitative analysis of concordance lines, co-text and the larger context in which these words occurred. The results show how originally close meanings of translation equivalents began to diverge and manifest noticeable changes in their connotative, affective and, to an extent, denotative meanings at a time of conflict in line with the dominant ideologies of the parliaments as well as the political affiliations of individuals.

Citations: 2

Exploring and categorising the Arabic copula and auxiliary kāna through enhanced part-of-speech tagging

IF 0.5

Corpora Pub Date : 2021-11-01 DOI: 10.3366/cor.2021.0225

A. Hardie, Wesam M. A. Ibrahim

Abstract: Arabic syntax has yet to be studied in detail from a corpus-based perspective. The Arabic copula kāna (‘be’), functions also as an auxiliary, creating periphrastic tense–aspect constructions; but the literature on these functions is far from exhaustive. To analyse kāna within the one-million word Corpus of Contemporary Arabic, part-of-speech tagging (using novel, targeted enhancements to a previously described program which improves the accessibility for linguistic analysis of the output of Habash et al.’s [2012] MADA disambiguator for the Buckwalter Arabic morphological analyser) is applied to disambiguate copula and auxiliary at a high rate of accuracy. Concordances of both are extracted, and 10 percent samples (499 instances of copula kāna and 387 of auxiliary kāna) are analysed manually to identify surface-level grammatical patterns and meanings. This raw analysis is then systematised according to the more general patterns’ main parameters of variation; special descriptions are developed for specific, apparently fixed-form expressions (including two phraseologies which afford expression of verbal and adjectival modality). Overall, we uncover substantial new detail, not mentioned in existing grammars (e.g., the quantitative predominance of the past imperfect construction over other uses of auxiliary kāna). There exists notable potential for these corpus-based findings to inform and enhance not only grammatical descriptions but also pedagogy of Arabic as a first or second/foreign language.

Citations: 1

Expanding LINDSEI to spoken learner English from several L1s across CEFR levels

IF 0.5

Corpora Pub Date : 2021-08-19 DOI: 10.3366/cor.2021.0220

Lan-fen Huang, Tomáš Gráf

Citations: 1

An algorithm to identify periods of establishment and obsolescence of linguistic items in a diachronic corpus

IF 0.5

Corpora Pub Date : 2021-08-19 DOI: 10.3366/cor.2021.0218

Evandro Cunha, S. Wichmann

Citations: 1

Review: McEnery, Hardie and Younis (eds). 2019. Arabic Corpus Linguistics. Edinburgh: Edinburgh University Press