{"title":"Conventionalized phrases and disability policy: A corpus analysis of 2-year and 4-year public colleges in California","authors":"Stephen Eyman","doi":"10.1016/j.acorp.2024.100113","DOIUrl":"10.1016/j.acorp.2024.100113","url":null,"abstract":"<div><div>This corpus-based study analyzes the use of conventionalized phrases in disability policy. Specifically, it focuses on the three phrases made common by the Americans with Disabilities Act: qualified individual with a disability, reasonable accommodations, and interactive process. These three phrases are analyzed in the context of disability policy at 2-year and 4-year public colleges in California. A corpus of disability policies was created for each of these contexts and analyzed to better understand the varied implementation of conventionalized phrases across contexts. The study finds that the three phrases from the ADA have been diffused across higher education disability policies in the corpora created and are highly conventionalized in these contexts. Additionally, these phrases can be used with slightly different valences depending on the context. These differences in use appear to be directly related to the relationship between the three phrases themselves and they mirror debates in disability policy such as that around the modal ‘may’ in relation to whether or not an institution implements an interactive process. Furthermore, institutional differences in the implementation of these phrases is potentially related to the stances institutions take towards disability and disability policy.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100113"},"PeriodicalIF":0.0,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142659056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The effects of teacher, peer and self-feedback on error correction with corpus use","authors":"Yoshiho Satake","doi":"10.1016/j.acorp.2024.100114","DOIUrl":"10.1016/j.acorp.2024.100114","url":null,"abstract":"<div><div>The strengths of corpora in language learning have been stated, while not many studies have explored the effects of feedback on error correction in the settings of data-driven learning (DDL), which is an approach where learners use corpora to learn language patterns inductively. Therefore, this study examines the effects of feedback on second language (L2) error correction with corpus use. The author hypothesizes that seeing many example sentences of the target word(s) with corpus use is useful in correcting L2 errors and that different sources of feedback have different effects on error correction. To test the hypotheses, the effects of teacher feedback on 55 participants’ error correction with use of the Corpus of Contemporary American English (COCA) were compared with those of peer feedback along with those of self-feedback. The results show that teacher feedback especially worked well for correcting omission errors and agreement errors. The strength of teacher feedback was identifying correctable errors. The results suggest that efficient corpus use for error correction requires teachers to consider appropriate combinations of feedback and error types (e.g., teacher feedback for omission errors and agreement errors).</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100114"},"PeriodicalIF":0.0,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142586523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigating the visual content of a commercialized academic listening test: Implications for validity","authors":"Zhuohan Hou , Vahid Aryadoust , Azrifah Zakaria","doi":"10.1016/j.acorp.2024.100109","DOIUrl":"10.1016/j.acorp.2024.100109","url":null,"abstract":"<div><div>As incorporating visual modes in listening tests is gradually gaining traction in second language (L2) assessment, the inclusion of such visuals brings up questions about the role of visual modes in meaning-making during listening and test validity. In this study, we investigated the visual features of the International English Language Testing System (IELTS) listening test through the application of the social semiotic multimodal framework. Our corpus comprised 300 visuals from 256 academic listening testlets published between 1996 and 2022. Unlike the past studies of social semiotic multimodal analyses that relied on qualitative methods, our study adopted a series of visualization and quantitative statistical analysis of frequency and dispersion measures, using the general linear model to examine the visuals from a social semiotic multimodal perspective. The results revealed significant variation in the visual structures of the testlets. Through applying a post-hoc analysis, we further proposed recommendations for further research on multimodal materials in listening assessment and discussed the implications of the observed variation for the validity of the IELTS listening test. This study may be considered the first attempt to examine L2 listening assessment from a corpus-based social semiotic multimodal perspective, which may inspire more investigations on multimodal listening.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100109"},"PeriodicalIF":0.0,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142537362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Corpus linguistics will benefit from greater adoption of pre-registration: A novice-friendly split-corpus approach to pre-registration","authors":"Matthew H.C. Mak","doi":"10.1016/j.acorp.2024.100111","DOIUrl":"10.1016/j.acorp.2024.100111","url":null,"abstract":"<div><div>In this brief article, I contend that the field of corpus linguistics stands to gain significantly from an increased adoption of pre-registration. Pre-registration serves to constrain the almost infinite degree of analytic freedom inherent in corpus analysis, thereby enhancing the transparency, reliability, and potential impact of corpus research. While pre-registration is increasingly popular in fields such as psychology and medicine, its uptake in corpus linguistics remains notably limited. To facilitate the transition toward pre-registration, I describe a straightforward split-corpus approach, ideally suited for corpus linguists new to pre-registration and for both hypothesis-testing and exploratory research. This method involves dividing a corpus into an exploratory set (20–40 % of the corpus) and a confirmatory set (the remaining 60–80 %). The exploratory set allows researchers to freely generate hypotheses and develop analysis plans, while the confirmatory set is then used for a more structured and objective analysis according to the pre-specified protocols. By employing this approach, corpus linguists can effectively balance exploratory flexibility with the rigour of confirmatory analysis, boosting the reliability of corpus findings. An increased uptake of pre-registration may not only bolster recognition of corpus linguistics as a robust empirical field, but it may also encourage a stronger emphasis on the building of cumulative knowledge.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100111"},"PeriodicalIF":0.0,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142445434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Breach of pacta sunt servanda: A corpus-assisted analysis of newspaper discourse on the AUKUS agreement","authors":"Radoslava Trnavac , Encarnacion Hidalgo Tenorio","doi":"10.1016/j.acorp.2024.100108","DOIUrl":"10.1016/j.acorp.2024.100108","url":null,"abstract":"<div><div>The AUKUS agreement,<span><span><sup>1</sup></span></span> a strategic pact between Australia, the United Kingdom, and the United States, primarily aimed to facilitate Australia's acquisition of eight nuclear-powered submarines from the US and Britain. This agreement led to the abrupt termination of a previous contract with France's state-owned Naval Group. This article examines the language used in media coverage of the AUKUS agreement in newspapers from various Anglophone and Asian countries. Employing a combination of Sentiment Analysis (Crossley et al., 2017) and Corpus-Assisted Discourse Studies (Partington, 2013; Gillings et al., 2023), we focus on identifying key linguistic patterns, themes, and the sentiment embedded in the discourse. Our findings indicate a general positive assessment of AUKUS in the Anglophone media, contrasted with negative portrayals in Chinese publications. Moreover, the analysis of linguistic components such as adjectives, nouns, and verbs reveals underlying complexities and conflicting viewpoints within the Anglophone discourse itself. By applying Corpus-Assisted Discourse Studies, we uncover the contextual and linguistic factors that shape these diverse perspectives.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100108"},"PeriodicalIF":0.0,"publicationDate":"2024-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142437801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying ChatGPT-generated texts in EFL students’ writing: Through comparative analysis of linguistic fingerprints","authors":"Atsushi Mizumoto , Sachiko Yasuda , Yu Tamura","doi":"10.1016/j.acorp.2024.100106","DOIUrl":"10.1016/j.acorp.2024.100106","url":null,"abstract":"<div><div>The emergence of generative AI (GenAI) poses new challenges for L2 writing teachers. This study investigates the distinguishability of essays written by Japanese EFL learners from those generated by ChatGPT. Partially replicating Herbold et al. (2023), 140 first-year university students wrote essays and completed a survey on ChatGPT use. Among them, 125 wrote independently, 13 used ChatGPT for proofreading, and two asked ChatGPT to write the entire essay. To create a comparative dataset, 123 additional essays were generated by ChatGPT, imitating the two texts. The resulting 263 essays were then analyzed using the natural language processing (NLP) technique, including automated linguistic analysis and machine learning classification using random forest. The results reveal significant differences between human-written and ChatGPT-generated essays across all linguistic features, with the latter being easily identifiable. This study emphasizes the need for clear guidelines on the ethical use of AI in L2 writing, highlighting the potential risk of inappropriate AI use and the importance of fostering a mutual understanding of AI use with learners regarding responsible AI integration in academic work.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100106"},"PeriodicalIF":0.0,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142422071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"English podcasts for schoolchildren and their vocabulary demands","authors":"Emily Casaletto , Irina Kerimova , Ulugbek Nurmukhamedov","doi":"10.1016/j.acorp.2024.100107","DOIUrl":"10.1016/j.acorp.2024.100107","url":null,"abstract":"<div><div>This exploratory study examines the vocabulary demands of English children's podcasts. A 359,153-word podcast corpus was created using the written transcripts of episodes from these popular children's podcasts: <em>But Why, Circle Round, KidNuz, Smash Boom Best</em>, and <em>Wow in the World</em>. The corpus was analyzed to determine the vocabulary size necessary to know 95 % and 98 % of the words in the English children's podcasts. The results showed that a vocabulary size of the most 4,000-word families plus knowledge of proper nouns (PN), marginal words (MW), transparent compounds (TC) and acronyms (AC) provided 95.69 % coverage of the children's podcast corpus and a vocabulary size of 7,000-word families plus PN, MW, TC and AC reached 98.10 % coverage, indicating that podcasts designed for children require a larger vocabulary size compared to general-audience podcasts designed for adults.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100107"},"PeriodicalIF":0.0,"publicationDate":"2024-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142327820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigating spoken classroom interactions in linguistically heterogeneous learning groups – An interdisciplinary approach to process video-based data in second language acquisition classrooms","authors":"Moritz Sahlender , Stefanie Helbert , Inga ten Hagen , Anastasia Knaus , Zarah Weiss","doi":"10.1016/j.acorp.2024.100104","DOIUrl":"10.1016/j.acorp.2024.100104","url":null,"abstract":"<div><div>Speaking the local language is central for successful integration into society. The teacher's language in second language (L2) classrooms serves as a crucial tool in language learning. Heterogeneity of learners’ language proficiency levels challenges teachers to adapt their language and accompanied instructional behavior. We offer an approach to study language acquisition processes and how teachers adapt their instructional language. This article presents our language-independent guidelines for processing video-based data of classroom interactions and demonstrate their reliability in a German as Second Language (GSL) classroom. These guidelines enable transcriptions of spoken language in noisy environments and detailed annotations of non-verbal classroom behavior. We outline research avenues at the intersection of empirical education research and linguistics that become feasible through these resources focusing on studying (non-)verbal adaptation strategies of teachers for learners at different proficiency levels. Our work directly fosters the interdisciplinary study of teacher-learner interactions, teacher competencies, and language acquisition.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100104"},"PeriodicalIF":0.0,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142323215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Capturing chronological variation in L2 speech through lexical measurements and regression analysis","authors":"Mariko Abe , Yuichiro Kobayashi , Yusuke Kondo","doi":"10.1016/j.acorp.2024.100105","DOIUrl":"10.1016/j.acorp.2024.100105","url":null,"abstract":"<div><p>This study aims to bridge gaps in current research by analyzing a longitudinal spoken learner corpus of low-proficiency English learners. We investigated the chronological variation in lexical measurements in second language (L2) speaking production, focusing on data from 104 low-proficiency learners elicited eight times over 23 months. Our findings show that measures such as the number of different words and type-token ratio are effective indicators of L2 speaking development, whereas the use of sophisticated vocabulary was not significantly correlated with learning duration. These results suggest that in the early stages of L2 acquisition, speaking skills are influenced primarily by lexical variation. This finding underscores the importance of lexical variation as a key factor in novice-level L2 speaking proficiency.</p></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100105"},"PeriodicalIF":0.0,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666799124000224/pdfft?md5=18e6b1567dc0d76abee155e9e4bd6910&pid=1-s2.0-S2666799124000224-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142270812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FreeTxt: A corpus-based bilingual free-text survey and questionnaire data analysis toolkit","authors":"Dawn Knight , Nouran Khallaf , Paul Rayson , Mahmoud El-Haj , Ignatius Ezeani , Steve Morris","doi":"10.1016/j.acorp.2024.100103","DOIUrl":"10.1016/j.acorp.2024.100103","url":null,"abstract":"<div><p>Qualitative free-text responses (e.g. from questionnaires and surveys) pose a challenge to many companies and institutions which lack the expertise to analyse such data with ease. While a range of sophisticated tools for the analysis of text <em>do</em> exist, these are often expensive, difficult to use and/or inaccessible to non-expert users. These tools also lack support for the analysis of English <em>and</em> Welsh text, which can be a particular challenge in the bilingual context of Wales. This paper details the key functionalities of the first corpus-based ‘FreeTxt’ toolkit which has been designed to support the systematic analysis and visualisation of free-text data, as a direct response to these two key needs. This paper demonstrates how, by working in partnership, software engineers, natural language processing (NLP) experts and corpus linguists can collaborate with end-users and beneficiaries to provide effective solutions to real world problems. Through the development of FreeTxt (<span><span>www.freetxt.app</span><svg><path></path></svg></span>), we aimed to empower end-users to <em>direct</em> and lead their own analyses of both small-scale and more extensive datasets to maximise the reach and potential impact generated. The approaches reported here, and the bilingual toolkit developed, can be replicated and extended for use in other language contexts and across a range of public and professional sectors. FreeTxt is now available for the analysis of Welsh and/or English, for use by <em>anyone</em> in <em>any sector</em> in Wales and beyond.</p></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100103"},"PeriodicalIF":0.0,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666799124000200/pdfft?md5=65f8a01d41b4150af967f22d4f542b8f&pid=1-s2.0-S2666799124000200-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142150563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}