{"title":"Toward a tool for evaluating corpus-based word lists for use in english language teaching contexts","authors":"Sarah Alzeer , Paul Thompson","doi":"10.1016/j.acorp.2024.100098","DOIUrl":"https://doi.org/10.1016/j.acorp.2024.100098","url":null,"abstract":"<div><p>With the proliferation of large corpora and the availability of sophisticated corpus-analysis tools, the number of corpus-based word lists targeting different types of vocabulary has rapidly increased during the last 20 years. This wide variety of lists has caused problems for practitioners, for whom it is not always easy to decide which list is most useful for their purpose and context. Given the paucity of systematic guidance on how to evaluate word lists, this study aimed to construct an evaluation tool that is based on Nation's (2016) framework of critiquing word lists, but is reformulated for a different purpose and for different target users, in order to increase the applicability of information derived from corpus analysis (the word lists). Constructed based on a thorough literature review, and informed by practitioners’ views and uses of word lists, along with consultations with ELT practitioners and word list experts, the tool targets ELT practitioners such as teachers, curriculum and assessment coordinators, and materials developers involved in directing vocabulary acquisition. The tool caters to practitioners with different levels of expertise and knowledge—especially those who are unfamiliar with the intricacies of developing corpus-based word lists. 
This paper documents the development of the initial version of the evaluation tool, as well as its first iteration, drawing upon the insights of both word list experts and practitioners in ELT.</p></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141483723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of verb argument constructions (VACs) in L2 learners across proficiency levels: A corpus-based study in L1 Indonesian","authors":"Febriana Lestari","doi":"10.1016/j.acorp.2024.100097","DOIUrl":"10.1016/j.acorp.2024.100097","url":null,"abstract":"<div><p>This study investigated the constructional knowledge development of L1 Indonesian learners by examining nineteen Verb-Argument Constructions (VACs). The VACs examined in the present study consist of a verb followed by a preposition and a noun phrase, for example, “V <em>about</em> N” as in “He <u>talked</u> <em>about</em> <u>the progress</u>”. This study used the Indonesian subset of the Education First Cambridge Open Language Database (EFCAMDAT) corpus from beginner to advanced levels (CEFR A1 to C1; Council of Europe, 2001). This dataset comprises 2943 written texts (224,763 words) from 623 learners. Frequency analysis of types and tokens was conducted to examine the distribution of the 19 VACs in learner writings across levels. Growth analyses were conducted to investigate the verbs that learners most frequently associated with the most productive VACs. Correlational analyses were conducted to explore how closely the verb-VAC associations, and the verbs occupying them, were related between proficiency levels. The results indicate that learners’ constructional knowledge development was implied by: (1) the frequency increase in types and tokens of VACs from lower to higher proficiency levels, (2) the variety of verbs associated with VACs, and (3) the construction schematicity increase indicated by the use of general to more specific verb productions distinct to proficiency levels. 
The results suggest that English language learners need more exposure to lexicogrammatical features to facilitate VACs acquisition and usage.</p></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141403911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From Argentina to Zimbabwe: Exploring the global appeal of the International Baccalaureate","authors":"Saira Fitzgerald","doi":"10.1016/j.acorp.2024.100096","DOIUrl":"10.1016/j.acorp.2024.100096","url":null,"abstract":"<div><p>This paper presents the third stage of a larger research project examining perceptions of the International Baccalaureate (IB) to better understand its growing influence on education systems around the world. The first two stages involved a synchronic and diachronic analysis of IB discourse in a 27 million word specialized corpus of global press articles, created as an unsolicited window into public opinion (Mautner, 2008). The present study uses the same corpus to explore how the IB is represented in different countries, what values and attitudes may be associated with it, and how it interacts with other global education actors. Bottom up and top down methods from corpus-assisted discourse studies (CADS) were used to analyze 34,104 newspaper articles from 56 countries. Frequency, collocation and concordance analyses revealed four dominant discourses of deficiency connected to national education systems in countries across the ideological spectrum that helped to legitimize the inclusion of private actors in the provision of education. 
Results also showed unique discourses associated with the IB in North America, thereby highlighting the key role that this region plays in the IB world.</p></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666799124000133/pdfft?md5=2c0d3c0ae023763fdea6e64970af8fc6&pid=1-s2.0-S2666799124000133-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141405145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attitudes, communicative functions, and lexicogrammatical features of anti-vaccine discourse on Telegram","authors":"Souad Boumechaal , Serge Sharoff","doi":"10.1016/j.acorp.2024.100095","DOIUrl":"10.1016/j.acorp.2024.100095","url":null,"abstract":"<div><p>This paper reports the process of collecting a corpus with examples of anti-vaccine discourse and the results of its linguistic analysis. The overall aim of the project is to help public health authorities to improve their communication campaigns by better understanding the conditions for misinformation spreading via social media. More specifically, this paper analyses linguistic properties of a corpus of prominent misinformation channels in Telegram as compared against a more general COVID corpus as well as against a general purpose English corpus. For this paper, the quantitative analysis relies on corpus querying to identify the most recurrent discourse patterns related to COVID vaccines. We use the appraisal framework to analyse the patterns with respect to the attitudes conveyed in the messages. We have also applied an automatic AI classifier to predict communicative functions of these texts. This allows us to examine them more closely through the use of simple lexicogrammatical features following Biber, as well as their ideational processes following Halliday. The findings show that common collocations in the Telegram corpus containing misinformation draw on three attitudes: fear, insecurity, and mistrust in COVID vaccines which are discursively constructed to promote vaccine hesitancy among social media users. 
Furthermore, the misinformation messages tend to occur more often in such communicative functions as promotional texts, news reporting, and text expressed as presenting reference information.</p></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141031247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wash your hands: CDC, WHO, and NHS tweets in the #COVID19 pandemic","authors":"Katherine A Ireland","doi":"10.1016/j.acorp.2024.100094","DOIUrl":"10.1016/j.acorp.2024.100094","url":null,"abstract":"<div><p>This work tracks public health messaging and evidence of stability and change in corpora of the Centers for Disease Control and Prevention (CDC), World Health Organization (WHO), and National Health Service (NHS) official account tweets throughout 2020. Using corpus-based methods, including keyword analysis, major similarities and differences are identified across tweets by each organization over time. Larger macro-level and micro-level discourses and linguistic patterns are revealed, with specific applications relevant to public health and governmental messaging, especially regarding risk and health communication. Findings include the NHS providing the most comprehensive and varied messaging out of each organization, including references to recommended actions, communities and individuals, and information. The WHO focuses predominantly on cases and region-specific information, while the CDC includes a variety of information, with a US-internal focus. Applications include further recommendations for public health communication, including the necessity of diverse linguistic patterns and interactive messaging tactics for governmental organizations.</p></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141054741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applying corpus linguistics to the law","authors":"Jesse Egbert , Ute Römer-Barron","doi":"10.1016/j.acorp.2024.100093","DOIUrl":"https://doi.org/10.1016/j.acorp.2024.100093","url":null,"abstract":"","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140620603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Representations of obesity in Australian and UK news coverage: A diachronic comparison","authors":"Luke C. Collins , Paul Baker , Gavin Brookes","doi":"10.1016/j.acorp.2024.100092","DOIUrl":"https://doi.org/10.1016/j.acorp.2024.100092","url":null,"abstract":"<div><p>In both Australia and the UK, the number of adults living with obesity has been increasing over the last 30 years (AIHW, 2023; Baker, 2023). Although policy has emphasised ‘community-based interventions’ in Australia (AIHW, 2017) and ‘system-wide approaches’ in the UK (Ulijaszek and McLennan, 2016) for overcoming the challenges of obesity, previous research has shown that media coverage has been dominated by representations promoting individual responsibility (e.g., Kim & Willis, 2007). In this paper, we report our observations of representations documented in corpora of media coverage from Australia and the UK between 2008 and 2017. The corpora amount to 16.4 million tokens and 36 million tokens, respectively. We identify key semantic domains for each year of the corpora and discuss both consistent and shifting themes in the data. Our findings show that the Australian coverage provides a more sustained focus on responses to obesity at the societal level, referring to practices in the food industry and differences between communities that can lead to health disparities. By comparison, while there is an increase in the amount of coverage in the UK press referring to obesity, the content became more narrowly focussed on food consumption and weight loss over the study period. 
The findings demonstrate how media coverage contributes to public understanding of how to respond to the challenges of obesity.</p></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666799124000091/pdfft?md5=6e9ecc0d87ef63dc626b52509b233d53&pid=1-s2.0-S2666799124000091-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140180464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"‘Luxurious’ metaphors in luxury hotel websites in Singapore and Hong Kong: A mixed-methods study","authors":"Joanna Zhuoan Chen, Kathleen Ahrens, Dennis Tay","doi":"10.1016/j.acorp.2024.100090","DOIUrl":"https://doi.org/10.1016/j.acorp.2024.100090","url":null,"abstract":"<div><p>Previous research has yielded a substantial body of empirical evidence regarding the use of metaphors in various types of discourse. However, limited research exists on the relationship between metaphor and more segmented economic industries, such as the luxury hospitality sector. The attention of this article is directed towards inspecting how metaphorical expressions are deployed by luxury hotels to construct their luxury identity and attract potential guests.</p><p>A corpus of 62 luxury hotel websites from Singapore and Hong Kong is used as the contextual background for the investigation of metaphor usage in this study. Using MIPVU (Metaphor Identification Procedure VU University Amsterdam), a total of 6990 metaphorical keywords, including a diverse range of 28 source domains, were observed. Among others, the five most productive source domains in the corpus are <span>living organism, physical object, space, artifact, and motion</span>. A mixed-methods approach that combines both quantitative data analytics and qualitative discourse analysis reveals and interprets significant associations between source domains, hotel facilities, and regions, suggesting that the choice of metaphorical expressions is not arbitrary but is influenced by specific factors related to the hotel's offerings and cultures. 
This study emphasises that the analysis of lexical-conceptual patterns in promotional texts can generate deeper insights into positioning strategies.</p></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140191302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using early LLMs for corpus linguistics: Examining ChatGPT's potential and limitations","authors":"Satoru Uchida","doi":"10.1016/j.acorp.2024.100089","DOIUrl":"https://doi.org/10.1016/j.acorp.2024.100089","url":null,"abstract":"<div><p>This study evaluates the extent to which information can be obtained from early Large Language Models (LLMs) for corpus linguistic research. Various tasks were conducted using ChatGPT 3.5, such as generating word frequency lists, collocations, words that fit certain grammatical patterns, and identifying genres. The generations were then compared with the search results from a large-scale general corpus (COCA). While favorable results were not achieved in identifying the genres of words or paragraphs, there was notable congruence in the frequency lists (75.0 %), collocations (42.8 %), and grammatical patterns (53.0 %) for the top 20 items. Even when the generated items did not perfectly match those from COCA, it was evident that high-frequency items were produced. Although LLMs may not be sufficient for rigorous academic research, the results are adequate for discerning overall trends or assisting learners. 
In addition, the results of this study show that the ability to search at the phrase level is an advantage of using LLMs for corpus research.</p></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666799124000066/pdfft?md5=322cc8730f1db87e3aee8190477b04ed&pid=1-s2.0-S2666799124000066-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140000123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"here-, there-, and every where-: Exploring the role of pronominal adverbs in legal language","authors":"David Chandler, Brett Hashimoto","doi":"10.1016/j.acorp.2024.100087","DOIUrl":"https://doi.org/10.1016/j.acorp.2024.100087","url":null,"abstract":"<div><p>Many have claimed that pronominal adverbs (PAs), such as <em>hereby, thereafter,</em> and <em>wherein</em>, are frequent, distinctive, and problematic in legal language (<u>Tiersma, 1999</u>; <u>Mellinkoff, 2004</u>). The purpose of this study is to examine those claims empirically. In the present study, the prevalence of PAs in legal registers is compared to more general registers of contemporary American English to determine the extent to which these words are distinctly legal. The study will also explore why different types of PAs may be (in)frequent in specific legal registers to better understand their use. The frequency of PAs was extracted from corpora that are designed to represent six registers of English (3 legal; 4 non-legal). Rates of occurrence of PAs per text were then compared across registers using Kruskal-Wallis tests with Dunn post-hoc tests and an eta<sup>2</sup> effect size. Subsequently, a functional analysis describing the uses of PAs was also conducted. The results indicate that PAs are highly restricted to legal registers because of the functions that they serve. The types of functions that PAs perform within a text are discussed. A closer examination of the PAs considered both individually as well as grouped by locative adverb (i.e., <em>here-, there-</em>, and <em>where-</em>) indicates that some PAs are also more distinctive to certain legal registers for different reasons. 
This study opens the discussion as to the utility and necessity of PAs in legal language and provides suggestions for legal writers on how to use or remove PAs without inhibiting clarity or effectiveness.</p></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666799124000042/pdfft?md5=e07e56f679be7690beb03b867265ebaa&pid=1-s2.0-S2666799124000042-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139749266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}