{"title":"Legal cynicism in Men’s Rights discourses: Using corpus linguistics to investigate how distrust in the legal system excuses and perpetuates sexual violence against women","authors":"Kate Barber","doi":"10.1016/j.acorp.2025.100148","DOIUrl":"10.1016/j.acorp.2025.100148","url":null,"abstract":"<div><div>The term <em>legal cynicism</em> refers to a type of legal disengagement which is associated with a lack of internal commitment to follow legal rules and a failure to acknowledge legal authority, typically stemming from perceived ongoing injustices and rights deprivations. This perception of the criminal justice system enables individuals in extremist communities to rationalise criminal actions, leading to an increased propensity for violent behaviour. Effectively identifying content such as this within online discourses has been argued to be the initial step in mitigating this propensity for violence, and corpus linguistic methods, employed as entry points into these discourses, offer effective tools for such analysis.</div><div>Using a 122,000-word corpus of online discourses produced by Men’s Rights Activists (MRAs) on blogs and the subreddit <em>r/MensRights</em>, quantitative and qualitative approaches are used in this corpus-assisted discourse analysis to determine how legal cynicism is indexed and generated. The ways in which the criminal justice systems in both the United States and United Kingdom are contextualised and reframed to embed legal cynicism in MRA discourses, and the evidential and legal processes highlighted as problematic by MRAs, are explored. The paper discusses the impact of this reframing of the criminal justice system on the potential for violence through conspiracy theories and legal disengagement. 
It concludes with suggestions for addressing legal cynicism through prebunking and educational strategies designed to challenge misconceptions of criminal justice processes.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 3","pages":"Article 100148"},"PeriodicalIF":2.1,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144988699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lexical epistemic markers in Ghanaian parliamentary discourse: A corpus-based diachronic analysis (2005–2024)","authors":"Emmanuel Mensah Bonsu","doi":"10.1016/j.acorp.2025.100161","DOIUrl":"10.1016/j.acorp.2025.100161","url":null,"abstract":"<div><div>Despite growing scholarly attention to parliamentary communication in established democracies, African legislative contexts remain underexplored. This study, therefore, examined lexical epistemic modality markers in Ghanaian parliamentary discourse using a corpus-based diachronic analysis (2005–2024). The corpus comprised 1,729 parliamentary Hansards (41.7 million words), processed with Python 3.x and AntConc. Analysis revealed that cognitive verbs dominated epistemic expression. Diachronic analysis found statistically significant changes across consecutive electoral periods. Standardised residual analysis showed redistribution from personalised cognitive claims toward markers framing propositions as objective assessments. The findings provide the first diachronic quantitative results for epistemic modality in Ghanaian and wider West African parliamentary discourse. The results suggest potential applications for parliamentary communication training.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 3","pages":"Article 100161"},"PeriodicalIF":2.1,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145465964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Corpus linguistics for safeguarding children online","authors":"Mark McGlashan , Charlotte-Rose Kennedy","doi":"10.1016/j.acorp.2025.100149","DOIUrl":"10.1016/j.acorp.2025.100149","url":null,"abstract":"<div><div>Safeguarding children in schools broadly refers to the actions taken to protect children from abuse, prevent damage to health and development, and promote conditions that would improve the life chances of children. To safeguard children, UK schools must implement filtering and monitoring software to “block harmful and inappropriate content without unreasonably impacting teaching and learning” (Department for Education, 2024: 40). The industry standard method for monitoring online language use in schools is ‘keyword monitoring’, which identifies the use or presence of specific words or phrases (e.g. ‘bomb’) that correlate with a specific form of risk (e.g. violence). However, this approach typically depends on lists of words isolated from their context(s) of use and tends only to raise concerns if there is a direct match to a ‘keyword’. This can lead to ‘false positives’ whereby a ‘keyword’ match raises an automatic safeguarding concern (e.g. ‘bomb’) even if the use of the keyword was innocuous (e.g. ‘bath bomb’). This paper introduces corpus linguistics as a set of methods and approaches to enhance the effectiveness of filtering and monitoring through a case study based on a 1,094,914-word corpus of online testimonies relating to suicide. In doing so, we demonstrate how corpus methods and analysis of authentic language data can be used to identify and contextualise safeguarding concerns. 
The practical applications of this research are intended to help schools to better protect children from the illegal and legal (but harmful) online materials that currently pose a threat to their safety and wellbeing.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 3","pages":"Article 100149"},"PeriodicalIF":2.1,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144925094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Corporate buzzword or genuine commitment? A corpus-assisted analysis of corporate ‘net-zero’ pledges by major global corporations","authors":"Matteo Fuoli, Annika Beelitz","doi":"10.1016/j.acorp.2025.100142","DOIUrl":"10.1016/j.acorp.2025.100142","url":null,"abstract":"<div><div>In recent years, corporations have faced growing pressure to address their environmental impact, leading many to pledge ‘net-zero’ emissions. This study employs corpus-assisted discourse analysis to examine how Fortune Global 500 companies communicate their net-zero commitments in their sustainability disclosures. Specifically, we conduct frequency, collocate, and concordance analyses to examine how the term <em>net zero</em> is discursively constructed and the solutions proposed to achieve this goal. Our findings support media observations that net zero has rapidly become a central theme in corporate discourse. However, corporate disclosures often frame net zero as a “journey” or an “ambition” and place a stronger focus on setting targets over concrete strategies to reduce emissions. These results raise questions about how credible corporate net-zero commitments are.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 3","pages":"Article 100142"},"PeriodicalIF":2.1,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144885853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"iThink, therefore iCheck: Critical engagement with ChatGPT in linguistic analysis and learning","authors":"Pierfranca Forchini, Amanda C. Murphy","doi":"10.1016/j.acorp.2025.100169","DOIUrl":"10.1016/j.acorp.2025.100169","url":null,"abstract":"<div><div>This study explores the integration of the free version of ChatGPT 4.0 into a graduate class, focusing on the tool’s ability to perform linguistic analysis and on students’ engagement with it through inductive learning. To address concerns about students’ uncritical use of ChatGPT, the research compares textual analyses of dialogs from two movies (drawn from the American Movie Corpus, <span><span>https://americanmoviecorpus.net</span></span>) carried out by the instructors and by ChatGPT. Adopting a quasi-experimental design, it examines how two groups of graduate students of English – one trained in linguistic analysis and the multidimensional analysis (MDA) framework, the other untrained – interacted with ChatGPT’s analyses and evaluated both the tool and the learning experience.</div><div>Both instructors and student groups used structured prompts to generate general textual and MDA-based analyses via ChatGPT. The instructors’ outputs (produced and collected at the same time as the students’ outputs) were analyzed to assess the tool’s performance, while the reflections on the experiment by the students served to evaluate the impact of prior training on their critical engagement.</div><div>The findings reveal that ChatGPT’s ability to perform both general and MDA-based analyses was limited, often inconsistent and inaccurate. Students with prior MDA training showed stronger data literacy and more critical engagement with the tool, while untrained students exhibited overreliance and misconceptions regarding ChatGPT’s capabilities. 
These results highlight the need for targeted instruction to foster analytical skills and reduce uncritical AI use.</div><div>This study contributes to ongoing debates on AI in education by underscoring the value of instructor guidance and structured training. It supports a pedagogical approach where AI is critically integrated into academic settings, encouraging informed and responsible student engagement.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 3","pages":"Article 100169"},"PeriodicalIF":2.1,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145624116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lexical choices of sharers and non-sharers on child sexual abuse material forums","authors":"Meike de Boer , Willemijn Heeren , Anton Daser , Colm Gannon , Frederic Gnielka , Salla Huikuri , Robert Lehmann , Rebecca Reichel , Thomas Schäfer , Alexander F. Schmidt , Katarzyna Staciwa , Arjan Blokland","doi":"10.1016/j.acorp.2025.100157","DOIUrl":"10.1016/j.acorp.2025.100157","url":null,"abstract":"<div><div>On the dark web, there are forums dedicated to the distribution and discussion of child sexual abuse material (CSAM). Although exchanging material is one of the major purposes of such forums, only a small portion of the users share CSAM themselves. Using keyness analysis, we analyzed word frequencies to see which words were unusually frequent for either CSAM sharers or non-sharers. The language of non-sharing members shows more positivity and rapport-building, which could be a way to compensate for not being able to meet the expectation to contribute material to the forum. In addition, they use more sexually explicit language, potentially to prove that they are a genuine part of the community. Sharers, on the other hand, talk more about the forum and the world outside of the forum where their practices are considered illegal. Hence, many words that are typical for the sharing members are related to the law and law enforcement. Before members start sharing, their language use is situated between non-sharers and sharers. They use positive, rapport-building, and explicit language, although less pronounced than non-sharers, and they refer to the forum community but not yet to the world outside the forum. Findings can be used by law enforcement in covert operations, who might want to mimic strategies to compensate for not being able to share CSAM. 
In addition, the results show that keyness analysis could potentially aid in differentiating between different groups of users on dark web CSAM forums, which could help law enforcement to prioritize target members in large-scale CSAM forums.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 3","pages":"Article 100157"},"PeriodicalIF":2.1,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145320232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discerning diachronic sinsign topic shifts: A case study of UK HIV news","authors":"Jiantao Zou, Xuri Tang","doi":"10.1016/j.acorp.2025.100163","DOIUrl":"10.1016/j.acorp.2025.100163","url":null,"abstract":"<div><div>The emerging triangulation approach in corpus-based critical discourse analysis—supra-lexical discursive component extraction in particular—faces the challenge of bridging macro-level analytical constructs (such as topics) with micro-level discursive realizations. This paper addresses this macro-micro divide in discerning sinsign topic shifts by proposing a framework that introduces unsupervised keyword extraction and word-embedding-based keyword clustering for topic shift identification and synthesizes collocation networks, sentiment analysis, and concordance reading to triangulate statistical topic shift patterns with fine-grained discursive realizations. The case study of UK HIV news discourse with the proposed framework identifies three diachronic shifts: the change from protection to prevention in HIV policy, destigmatization, and increasing focus on life quality of people living with HIV, all validated through macro-micro triangulation.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 3","pages":"Article 100163"},"PeriodicalIF":2.1,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145578580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Comparative Study of Graded Vocabulary Features in HSK Level 6 Listening Materials and Media Audio, and an Analysis of the Graded Word List","authors":"Nan Xue PhD (student) , Jimin Wang PhD (Professor)","doi":"10.1016/j.acorp.2025.100143","DOIUrl":"10.1016/j.acorp.2025.100143","url":null,"abstract":"<div><div>Vocabulary familiarity plays a critical role in Chinese language learners’ listening comprehension. This study compares HSK Level 6 listening materials (∼50,000 tokens) and transcribed media audio texts (∼100,000 tokens), using the graded word lists from the Standards for Chinese Language Proficiency in International Chinese Education. Applying Python and the Language Technology Platform (LTP) for segmentation and automated processing, the study calculates the proportions of vocabulary across levels. Results reveal no significant differences in graded word coverage between the two corpora, but both contain a substantial proportion of unclassified words, indicating limited coverage by current word lists. Frequency analysis also shows underuse of many listed words. These findings highlight the need to enhance graded word lists through corpus-based NLP techniques and suggest that topic type may influence vocabulary distribution in listening texts.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 3","pages":"Article 100143"},"PeriodicalIF":2.1,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144842288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Raising genre awareness through visualizing language features","authors":"John Blake, Maxim Mozgovoy","doi":"10.1016/j.acorp.2025.100162","DOIUrl":"10.1016/j.acorp.2025.100162","url":null,"abstract":"<div><div>This paper introduces the Feature Visualizer, an open-access AI-powered tool designed to raise genre awareness among novice academic writers through inductive learning, a process that includes approaches such as discovery learning. The tool houses an annotated corpus of scientific research articles written by computer science majors and allows learners to explore authentic texts using on-demand visualizations and multimodal explanations. By engaging with the corpus, learners identify recurring language patterns and rhetorical structures at macro, meso, and micro levels, facilitating the bottom-up discovery of genre conventions. A longitudinal study with Japanese undergraduate computer science majors showed that the tool enhanced learners’ awareness of academic writing conventions and genre features. Focus group interviews further confirmed the usability and pedagogical value of the Feature Visualizer. We conclude by discussing practical applications for genre-based writing instruction informed by inductive learning principles.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 3","pages":"Article 100162"},"PeriodicalIF":2.1,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145578581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A corpus-assisted discourse analysis of epistemic stances in tweets about U.S. police from 2013 to 2023","authors":"Mark Winston Visonà , Şebnem Kurt","doi":"10.1016/j.acorp.2025.100138","DOIUrl":"10.1016/j.acorp.2025.100138","url":null,"abstract":"<div><div>Recently, public debate regarding law enforcement practices has extended into digital spaces, particularly on social media platforms such as Twitter (X). Prior research has focused on police-initiated communication both offline and online, yet few studies have explored how the public discusses policing on social media or whether this discussion has changed diachronically. The current study addresses these gaps via a corpus-assisted discourse analysis of a subset of tweets from four U.S. cities (Chicago, Houston, Los Angeles, and Washington, DC) posted between 2013 and 2023 containing the word ‘police’ and the epistemic marker ‘think/thought.’ By examining these tweets, the study analyzes how Twitter users position themselves or others on an epistemic gradient as more (K+) or less (K-) knowledgeable about specific aspects of policing. Using a mixed-methods approach that combines n-gram analysis with discourse analysis of stancetakers and tweet topics, this study identifies how key events shaped Twitter users’ attitudes towards U.S. policing practices over the last decade. Findings indicate that K+ tweets most frequently discussed police services, followed by crime/victims, with particular services like calling 911 and crimes involving vehicles debated by users. In K- tweets, users critiqued others’ knowledge of policing while police services remained the dominant topic with secondary topics like race varying more than in K+ tweets. 
This study thus contributes to our understanding of public perceptions of policing in online contexts and highlights epistemic stancetaking strategies used by Twitter users to involve others when discussing contentious issues related to law enforcement in the U.S.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 3","pages":"Article 100138"},"PeriodicalIF":0.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144604558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}