{"title":"Exploring the barriers to data-driven learning in the classroom: a systematic qualitative synthesis","authors":"Amelie Xiaohan Sun, Atsushi Mizumoto","doi":"10.1016/j.acorp.2025.100126","DOIUrl":"10.1016/j.acorp.2025.100126","url":null,"abstract":"<div><div>In the current study, we conducted a systematic qualitative synthesis of the literature on data-driven learning (DDL) to identify the potential difficulties, barriers, challenges, and obstacles associated with the implementation of DDL in the classroom. We found and gathered a total of 347 primary studies, including individual research articles or reports, published between 2012 and 2021. After a rigorous screening process based on four criteria, 94 target articles were selected and analyzed. These criteria were: (1) inclusion of empirical studies on DDL published between 2012 and 2021, (2) implementation of the DDL approach by instructors for English language learners, (3) utilization of research designs capturing qualitative data on learner and teacher perceptions of DDL, and (4) discussion of challenges and obstacles related to DDL from both learners' and teachers' perspectives. This screening resulted in 295 data segments, which are specific excerpts or portions within these target articles, each referring to one participant's experience and articulated challenges. These data segments were subsequently extracted and coded. The results suggest that the barriers to DDL can be categorized into two core groups: inherent and external factors. Inherent factors relate to the actual tasks involved in DDL, while external factors are connected to the users of this approach. Both categories may impact DDL's effectiveness. The findings imply that situational context and the emotional state of users should also be considered when designing tools, materials, and teaching methodologies for DDL. 
These insights have implications for both practitioners and researchers and could inform the development of training programs and materials aimed at enhancing the effectiveness of DDL in language learning and teaching.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 2","pages":"Article 100126"},"PeriodicalIF":0.0,"publicationDate":"2025-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143621278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stylistic nuances through syntactic complexity: A corpus-assisted study of narration and dialogue in two English translations of Hongloumeng","authors":"Yifeng Sun , Kanglong Liu","doi":"10.1016/j.acorp.2025.100125","DOIUrl":"10.1016/j.acorp.2025.100125","url":null,"abstract":"<div><div><em>Hongloumeng</em> (also known as <em>The Story of the Stone</em> or <em>A Dream of Red Mansions</em>), a significant work in Chinese literature, along with its various English translations, has been the subject of substantial scholarly attention. Among these, the two translated versions by David Hawkes and John Minford, and Xianyi Yang and Gladys Yang, have garnered much academic interest and sparked extensive discussions. However, there remains a significant void in the thorough analysis of syntactic complexity, a crucial aspect of their respective distinct translation styles. This study aims to address this gap by conducting a meticulous examination of the syntactic complexity in the first 80 chapters of the novel, as translated by Hawkes and the Yangs, with a specific focus on the subgenres of narration and dialogue. The analysis reveals substantial disparities, such as Hawkes employing longer linguistic units in narration and a higher frequency of subordinations in dialogue. By emphasizing the importance of syntactic complexity within the realm of translation style, this study advocates for integrating metrics that assess syntactic complexity in future explorations related to translation styles. 
The implications of these findings for enhancing translation research and pedagogy are also discussed.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 2","pages":"Article 100125"},"PeriodicalIF":0.0,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143580102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adolescent reading experience, independent choices and curriculum materials","authors":"Beverley Jennings , Daisy Powell , Sylvia Jaworska , Holly Joseph","doi":"10.1016/j.acorp.2025.100124","DOIUrl":"10.1016/j.acorp.2025.100124","url":null,"abstract":"<div><div>Reading comprehension ability is assessed in England within the English language GCSE exam. This is a high stakes exam, taken by all 16-year-olds, and a pass grade is needed to progress onto the next stage of education and employment. Since reading experience is an important predictor of reading comprehension ability, two different types of reading materials were explored to see how well they matched the reading required in the exam: 1) curriculum reading; and 2) independent reading. Two corpora of texts representing the two types of reading were created and explored using the methods of Corpus Linguistics. The curriculum reading corpus (CRC) had lower linguistic diversity, and higher frequency of nouns but lower frequency of adverbs, than the independent reading corpus (IRC). Exploratory analysis of the most frequent parts of speech revealed that the CRC had words that were more abstract and conceptual, whereas the IRC featured words about the concrete and the everyday, suggesting that curriculum reading presents a different type of vocabulary challenge. The CRC was not as close a match to the exam texts as the IRC. As the English language GCSE exam is used as a measure of literacy competency for both future study and future employment, this suggests that the types of texts chosen for the exam are not a good match for this purpose. 
The choice of texts in assessments therefore needs careful consideration.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 1","pages":"Article 100124"},"PeriodicalIF":0.0,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143548444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The unit of analysis in learner corpus research on formulaic language","authors":"Joe Geluso , Hui-Hsien Feng , Randy Appel","doi":"10.1016/j.acorp.2025.100123","DOIUrl":"10.1016/j.acorp.2025.100123","url":null,"abstract":"<div><div>This study employs two case studies to investigate how differences in the unit of analysis in learner corpus research (LCR) studies on formulaic language (e.g., lexical bundles and phrase frames) have the potential to lead researchers to disparate inferences even when analyzing the same corpora. LCR studies on written formulaic language (FL) commonly use the corpus as the unit of analysis, or a per-corpus approach, for inter-group comparisons. This approach combines essays from different individuals into a single long essay that represents the entire group. Less frequently, LCR studies on FL use the individual texts that comprise a corpus as the unit of analysis, or a per-text approach. A per-text approach allows the researcher to generate group means and standard deviations, or ranked frequencies at the text level. Findings suggest that the two research designs can lead to different results and hence conflicting inferences from the same data set. Specifically, a per-text approach appears less prone to identify significant differences between groups than a per-corpus approach, and better reflects similarities between groups such as the absence of linguistic features. 
We conclude with instructions on how to generate per-text counts using a popular and free corpus analysis tool.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 1","pages":"Article 100123"},"PeriodicalIF":0.0,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143487865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"‘I am still unsure…’ – Spontaneous expressions of vaccine indecision on Mumsnet","authors":"Zsófia Demjén , Vaclav Brezina , Tara Coltman-Patel , William Dance , Richard Gleave , Claire Hardaker , Elena Semino","doi":"10.1016/j.acorp.2025.100122","DOIUrl":"10.1016/j.acorp.2025.100122","url":null,"abstract":"<div><div>Vaccination programmes in 90 % of countries in the world have been affected by ‘vaccine hesitancy’. Childhood vaccinations are particularly important. Internationally, these vaccination rates have been declining, resulting in the resurfacing of communicable diseases previously considered eliminated. In this context, our paper examines parents’ unelicited expressions of vaccine indecision – dilemmas, hesitations and concerns related to vaccinating at the point of decision-making.</div><div>Our corpus-assisted discourse study combines discourse analysis and the qualitative and quantitative tools of corpus linguistics (text dispersion keywords and concordancing) to compare 422 Original Posts from the forum of the UK-based parenting website Mumsnet that outline vaccine indecision to vaccination discussions that do not involve decision-making difficulties. We examine what characterises authentic vaccine indecision in the localised context of Mumsnet users. Specifically, we analyse which vaccines Mumsnet users are undecided about; what concerns are linked to indecision specifically; and how such concerns are raised in a generally pro-vaccination online space. Our method combines the advantages of analysing large datasets with those of nuanced and localised qualitative analysis.</div><div>The vaccine that most consistently concerns parents is MMR. Concerns about other vaccines fluctuate with disease outbreaks and the introduction of new vaccines. 
The concerns linked to indecision specifically are mostly individual and vaccine-specific, and include the mode and timing of vaccinations, particular personal and family circumstances and a rather unspecific notion of side effects. Mumsnet users invite details about others' personal experiences to fill a need left by widely available general vaccine information. The implication is that health services may need to redirect resources from mass population level campaigns to more personalised and tailored approaches for parents who are hesitant about specific vaccines at particular points in time.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 1","pages":"Article 100122"},"PeriodicalIF":0.0,"publicationDate":"2025-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143210206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How humans and machines identify discourse topics: A methodological triangulation","authors":"Mathew Gillings , Sylvia Jaworska","doi":"10.1016/j.acorp.2025.100121","DOIUrl":"10.1016/j.acorp.2025.100121","url":null,"abstract":"<div><div>Identifying and exploring discursive topics in texts is of interest to not only linguists, but to researchers working across the full breadth of the social sciences. This paper reports on an exploratory study assessing the influence that analytical method has on the identification and labelling of topics, which might lead to varying interpretations of texts. Using a corpus of corporate sustainability reports, totalling 98,277 words, we asked 6 different researchers to interrogate the corpus and decide on its main ‘topics’ via four different methods: LLM-assisted analyses; topic modelling; concordance analysis; and close reading. These methods differ according to the amount of data that can be analysed at once, the amount of textual context available to the researcher, and the focus of the analysis (i.e., micro to macro). The paper explores how the identified topics differed both between analysts using the same method, and between methods. 
We conclude with a series of tentative observations regarding the benefits and limitations of each method, and offer recommendations for researchers in choosing which analytical technique to select.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 1","pages":"Article 100121"},"PeriodicalIF":0.0,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143159796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Anywhere but here: Discourses and representations surrounding same-sex marriage in Japanese newspapers","authors":"Keisuke Yoshimoto","doi":"10.1016/j.acorp.2025.100120","DOIUrl":"10.1016/j.acorp.2025.100120","url":null,"abstract":"<div><div>Although support for same-sex marriage has grown in Japan, discussions on its legalisation have been slow in the Japanese parliament. To contribute to a more meaningful discussion on this issue, this study uses corpus-driven and corpus-based methods to explore how issues surrounding same-sex marriage are represented in media discourse. It compares two corpora comprising articles from two Japanese national newspapers: the more conservative <em>Yomiuri Shimbun</em> (608,305 words) and liberal <em>Asahi Shimbun</em> (1,681,133 words). The data are from 1 April 2015, when Tokyo's Shibuya Ward started certifying same-sex couples, to 15 March 2024, the day after the Sapporo High Court ruled on same-sex marriage. Keywords, collocation, and concordance analysis are used to identify the differences in discourses and representations, exploring how their opinions on same-sex marriage are explicitly and implicitly delivered. The findings reveal that the <em>Yomiuri Shimbun</em> mostly depicts gay and lesbian people as fictional characters, or foreigners and argues against same-sex marriage in terms of child welfare. 
In contrast, the <em>Asahi Shimbun</em> considers the issues surrounding LGBTQ+ people to be related to human rights and criticises traditional heteropatriarchal family values as obstacles to advancing same-sex marriage movements and women's rights alike.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 1","pages":"Article 100120"},"PeriodicalIF":0.0,"publicationDate":"2025-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143159797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Is LIWC reliable, efficient, and effective for the analysis of large online datasets in forensic and security contexts?","authors":"Madison Hunter, Tim Grant","doi":"10.1016/j.acorp.2025.100118","DOIUrl":"10.1016/j.acorp.2025.100118","url":null,"abstract":"<div><div>This article evaluates the reliability, efficiency, and effectiveness of Linguistic Inquiry and Word Count (LIWC; Boyd et al., 2022) for the analysis of a white nationalist forum. This is important because LIWC has been the computational tool of choice for scores of studies generally and many examining extremist content in a forensic or security context. Our purpose, therefore, is to understand whether LIWC can be depended upon for large-scale analyses; we initially examine this here using a small sample of posts from a set of just eight users and manually checking the program's automated codings of a subset of categories. Our results show that the LIWC coding cannot be relied upon – precision falls to as low as 49.6 % and recall as low as 41.7 % for some categories. It would be possible to engage in considerable manual correction of these results, but this undermines its purported efficiency for large datasets.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 1","pages":"Article 100118"},"PeriodicalIF":0.0,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143159794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The personal_relationship frame in love fraud","authors":"Pamela Faber","doi":"10.1016/j.acorp.2025.100119","DOIUrl":"10.1016/j.acorp.2025.100119","url":null,"abstract":"<div><div>This research analyzed the love fraud event within the context of the <span>Personal_Relation</span> and the <span>Intentional_Deception</span> frame in FrameNet. Of the concepts that characterize this event, the focus was on <span>Relationship,</span> namely, its stages, participants, and dimensions. The data consisted of extended conversations between 83 scammers and the author, which were recorded from January 2021 to June 2024. When the corpus was analyzed on the SketchEngine platform, the collocates of <em>relationship</em> with the highest LogDice scores were identified and structured. The results show that fraudsters use scripts to construct a romantic relationship with victims, which begins with friendship, progresses to ‘soulmateship’ and engagement, and finally ends in an online ‘marriage’. This is accomplished through the strategic use and repetition of terms that belong to the <span>Personal_Relation</span> frame in FrameNet. The objective is to extract as much money as possible from the victim.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 1","pages":"Article 100119"},"PeriodicalIF":0.0,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143159795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introductory editorial synthesis paper: Corpus linguistics and the language of COVID-19: Applications and outcomes","authors":"David Oakey , Benet Vincent","doi":"10.1016/j.acorp.2024.100110","DOIUrl":"10.1016/j.acorp.2024.100110","url":null,"abstract":"<div><div>This article provides an overview of the papers in the special issue of Applied Corpus Linguistics on “Corpus Linguistics and the Language of COVID-19: Applications and Outcomes”. As noted in our original call for contributions, we believe that, while traditional corpus linguistic work can reveal valuable insights into the emerging language around COVID-19, it should be complemented by more applied corpus linguistics work. The pandemic posed a real-world problem which applied corpus linguists were well equipped to address using linguistic evidence from a range of sources. This article presents an introduction to the papers in this special issue which will be of interest to applied corpus linguists due to the variety of perspectives they present in relation to a number of key issues of importance to the field: the data they draw on, the various theoretical frameworks which inform the research, the methods they use to collect and analyse the data, and the discussion of how their findings may be applicable to citizens, decision makers, consumers and other stakeholders in public and private contexts.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100110"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143098309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}