Zsófia Demjén , Vaclav Brezina , Tara Coltman-Patel , William Dance , Richard Gleave , Claire Hardaker , Elena Semino
{"title":"‘I am still unsure…’ – Spontaneous expressions of vaccine indecision on Mumsnet","authors":"Zsófia Demjén , Vaclav Brezina , Tara Coltman-Patel , William Dance , Richard Gleave , Claire Hardaker , Elena Semino","doi":"10.1016/j.acorp.2025.100122","DOIUrl":"10.1016/j.acorp.2025.100122","url":null,"abstract":"<div><div>Vaccination programmes in 90 % of countries in the world have been affected by ‘vaccine hesitancy’. Childhood vaccinations are particularly important. Internationally, these vaccination rates have been declining, resulting in the resurfacing of communicable diseases previously considered eliminated. In this context, our paper examines parents’ unelicited expressions of vaccine indecision – dilemmas, hesitations and concerns related to vaccinating at the point of decision-making.</div><div>Our corpus-assisted discourse study combines discourse analysis and the qualitative and quantitative tools of corpus linguistics (text dispersion keywords and concordancing) to compare 422 Original Posts from the forum of the UK-based parenting website Mumsnet that outline vaccine indecision to vaccination discussions that do not involve decision-making difficulties. We examine what characterises authentic vaccine indecision in the localised context of Mumsnet users. Specifically, we analyse which vaccines Mumsnet users are undecided about; what concerns are linked to indecision specifically; and how such concerns are raised in a generally pro-vaccination online space. Our method combines the advantages of analysing large datasets with those of nuanced and localised qualitative analysis.</div><div>The vaccine that most consistently concerns parents is MMR. Concerns about other vaccines fluctuate with disease outbreaks and the introduction of new vaccines. 
The concerns linked to indecision specifically are mostly individual and vaccine-specific, and include the mode and timing of vaccinations, particular personal and family circumstances and a rather unspecific notion of side effects. Mumsnet users invite details about others' personal experiences to fill a need left by widely available general vaccine information. The implication is that health services may need to redirect resources from mass population level campaigns to more personalised and tailored approaches for parents who are hesitant about specific vaccines at particular points in time.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 1","pages":"Article 100122"},"PeriodicalIF":0.0,"publicationDate":"2025-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143210206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How humans and machines identify discourse topics: A methodological triangulation","authors":"Mathew Gillings , Sylvia Jaworska","doi":"10.1016/j.acorp.2025.100121","DOIUrl":"10.1016/j.acorp.2025.100121","url":null,"abstract":"<div><div>Identifying and exploring discursive topics in texts is of interest not only to linguists but also to researchers working across the full breadth of the social sciences. This paper reports on an exploratory study assessing the influence that analytical method has on the identification and labelling of topics, which might lead to varying interpretations of texts. Using a corpus of corporate sustainability reports, totalling 98,277 words, we asked six researchers to interrogate the corpus and decide on its main ‘topics’ via four different methods: LLM-assisted analyses; topic modelling; concordance analysis; and close reading. These methods differ according to the amount of data that can be analysed at once, the amount of textual context available to the researcher, and the focus of the analysis (i.e., micro to macro). The paper explores how the identified topics differed both between analysts using the same method, and between methods. 
We conclude with a series of tentative observations regarding the benefits and limitations of each method, and offer recommendations to help researchers choose an analytical technique.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 1","pages":"Article 100121"},"PeriodicalIF":0.0,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143159796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Anywhere but here: Discourses and representations surrounding same-sex marriage in Japanese newspapers","authors":"Keisuke Yoshimoto","doi":"10.1016/j.acorp.2025.100120","DOIUrl":"10.1016/j.acorp.2025.100120","url":null,"abstract":"<div><div>Although support for same-sex marriage has grown in Japan, discussions on its legalisation have been slow in the Japanese parliament. To contribute to a more meaningful discussion on this issue, this study uses corpus-driven and corpus-based methods to explore how issues surrounding same-sex marriage are represented in media discourse. It compares two corpora comprising articles from two Japanese national newspapers: the more conservative <em>Yomiuri Shimbun</em> (608,305 words) and the more liberal <em>Asahi Shimbun</em> (1,681,133 words). The data are from 1 April 2015, when Tokyo's Shibuya Ward started certifying same-sex couples, to 15 March 2024, the day after the Sapporo High Court ruled on same-sex marriage. Keyword, collocation, and concordance analyses are used to identify the differences in discourses and representations, exploring how the newspapers' opinions on same-sex marriage are explicitly and implicitly delivered. The findings reveal that the <em>Yomiuri Shimbun</em> mostly depicts gay and lesbian people as fictional characters or foreigners, and argues against same-sex marriage in terms of child welfare. 
In contrast, the <em>Asahi Shimbun</em> considers the issues surrounding LGBTQ+ people to be related to human rights and criticises traditional heteropatriarchal family values as obstacles to advancing same-sex marriage movements and women's rights alike.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 1","pages":"Article 100120"},"PeriodicalIF":0.0,"publicationDate":"2025-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143159797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Is LIWC reliable, efficient, and effective for the analysis of large online datasets in forensic and security contexts?","authors":"Madison Hunter, Tim Grant","doi":"10.1016/j.acorp.2025.100118","DOIUrl":"10.1016/j.acorp.2025.100118","url":null,"abstract":"<div><div>This article evaluates the reliability, efficiency, and effectiveness of Linguistic Inquiry and Word Count (LIWC; Boyd et al., 2022) for the analysis of a white nationalist forum. This is important because LIWC has been the computational tool of choice for scores of studies generally and many examining extremist content in a forensic or security context. Our purpose, therefore, is to understand whether LIWC can be depended upon for large-scale analyses; we initially examine this here using a small sample of posts from a set of just eight users and manually checking the program's automated codings of a subset of categories. Our results show that the LIWC coding cannot be relied upon – precision falls to as low as 49.6 % and recall as low as 41.7 % for some categories. It would be possible to engage in considerable manual correction of these results, but this undermines its purported efficiency for large datasets.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 1","pages":"Article 100118"},"PeriodicalIF":0.0,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143159794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The personal_relationship frame in love fraud","authors":"Pamela Faber","doi":"10.1016/j.acorp.2025.100119","DOIUrl":"10.1016/j.acorp.2025.100119","url":null,"abstract":"<div><div>This research analyzed the love fraud event within the context of the <span>Personal_Relation</span> and the <span>Intentional_Deception</span> frame in FrameNet. Of the concepts that characterize this event, the focus was on <span>Relationship,</span> namely, its stages, participants, and dimensions. The data consisted of extended conversations between 83 scammers and the author, which were recorded from January 2021 to June 2024. When the corpus was analyzed on the SketchEngine platform, the collocates of <em>relationship</em> with the highest LogDice scores were identified and structured. The results show that fraudsters use scripts to construct a romantic relationship with victims, which begins with friendship, progresses to ‘soulmateship’ and engagement, and finally ends in an online ‘marriage’. This is accomplished through the strategic use and repetition of terms that belong to the <span>Personal_Relation</span> frame in FrameNet. The objective is to extract as much money as possible from the victim.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"5 1","pages":"Article 100119"},"PeriodicalIF":0.0,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143159795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introductory editorial synthesis paper: Corpus linguistics and the language of COVID-19: Applications and outcomes","authors":"David Oakey , Benet Vincent","doi":"10.1016/j.acorp.2024.100110","DOIUrl":"10.1016/j.acorp.2024.100110","url":null,"abstract":"<div><div>This article provides an overview of the papers in the special issue of Applied Corpus Linguistics on “Corpus Linguistics and the Language of COVID-19: Applications and Outcomes”. As noted in our original call for contributions, we believe that, while traditional corpus linguistic work can reveal valuable insights into the emerging language around COVID-19, it should be complemented by more applied corpus linguistics work. The pandemic posed a real-world problem which applied corpus linguists were well equipped to address using linguistic evidence from a range of sources. This article presents an introduction to the papers in this special issue which will be of interest to applied corpus linguists due to the variety of perspectives they present in relation to a number of key issues of importance to the field: the data they draw on, the various theoretical frameworks which inform the research, the methods they use to collect and analyse the data, and the discussion of how their findings may be applicable to citizens, decision makers, consumers and other stakeholders in public and private contexts.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100110"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143098309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Corrigendum to “here-, there-, and every where-: Exploring the role of pronominal adverbs in legal language” [Applied Corpus Linguistics Volume 4, Issue 1 (2024) 100087]","authors":"David Chandler, Brett Hashimoto","doi":"10.1016/j.acorp.2024.100112","DOIUrl":"10.1016/j.acorp.2024.100112","url":null,"abstract":"","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100112"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143098310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lexical complexity in academic lectures: Comparative analysis of EMI and Non-EMI settings and influential factors","authors":"Chen Chen , Philip Durrant","doi":"10.1016/j.acorp.2024.100115","DOIUrl":"10.1016/j.acorp.2024.100115","url":null,"abstract":"<div><div>Despite the substantial body of research on vocabulary in English Medium Instruction (EMI), there is a noticeable dearth of corpus-based studies examining lexical complexity of EMI lectures, particularly in specific disciplines. To fill this gap, this study developed an EMI spoken academic corpus in Business (EMIB) with 120 lectures collected from 54 lecturers with nine different first languages (L1), reaching 1.12 million tokens. The study compared the lexical complexity of EMI Business lectures in China with academic lectures in Anglophone and non-Anglophone settings, represented by teachers’ speech in the British Academic Spoken English Corpus (BASE) and the Corpus of English as a Lingua Franca in Academic Settings (ELFA), respectively. Lexical complexity was conceptualised by lexical sophistication (operationalised by lexical frequency profile and mean frequency band score) and lexical diversity (operationalised by the VOCD-D). Results show that ELFA has significantly higher lexical sophistication than BASE, and significantly lower lexical diversity than BASE and EMIB. This study further explored whether speaker L1, speaker gender, and discipline contributed to the lexical complexity of lectures using multiple linear regression with interaction terms. Results show that speaker L1 and discipline significantly impacted the lexical complexity of lectures. 
Pedagogical implications are discussed.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100115"},"PeriodicalIF":0.0,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Getting into bed with embeddings? A comparison of collocations and word embeddings for corpus-assisted discourse analysis","authors":"Jordan Batchelor","doi":"10.1016/j.acorp.2024.100117","DOIUrl":"10.1016/j.acorp.2024.100117","url":null,"abstract":"<div><div>This paper discusses two approaches for identifying lexical patterns in discourse, namely the corpus linguistic method of collocation analysis and the natural language processing method of word embeddings. While both approaches can identify lexical patterns, they approach the task with different underlying frameworks, and the extent to which their results resemble one another has not been directly compared. This study uses two corpora, five collocation measures, and two word embedding algorithms to generate such comparisons. Results generally support the notion that many word pairs with similar embeddings are collocates, and that, to a lesser extent, many collocates have similar word embeddings. However, a major difference is that word pairs with similar embeddings do not need to co-occur often, or at all. Moreover, systematic differences in the kinds of words highlighted between the two word embedding algorithms were found and are discussed.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100117"},"PeriodicalIF":0.0,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142659058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Examining in-service senior high school English teachers’ perspectives on corpus use and the effects of corpus training","authors":"Hsiao-Ling Hsu , Shu-Li Lai , Hao-Jan Howard Chen","doi":"10.1016/j.acorp.2024.100116","DOIUrl":"10.1016/j.acorp.2024.100116","url":null,"abstract":"<div><div>Given the benefits of incorporating corpora into language learning, particularly in developing students’ abilities to observe and analyze language data, this study investigated Taiwanese in-service senior high school English teachers’ corpus literacy, their application of corpus tools in teaching, and the effects of an online corpus workshop. The study was conducted in two stages. The first involved collecting 151 teachers’ perceptions of corpus literacy and its applications from 141 schools across Taiwan. The second stage invited teachers across Taiwan to participate in an online corpus workshop, where corpus-based teaching and two tools (SKELL and Sketch Engine) were introduced, along with hands-on activities. Following the workshop, the participants completed a post-survey. The analysis of the pre-survey responses revealed that, before attending the workshop, teachers had a positive attitude toward corpus use but a limited understanding of it. The Wilcoxon Signed Rank test, used to analyze the pre- and post-survey responses, showed significant improvements in the teachers’ corpus literacy and application skills after the workshop. The findings of this study offer valuable insights into corpus use among in-service teachers in various contexts. 
Future research should explore the further integration of corpus tools into classrooms and include in-depth interviews for more comprehensive insights.</div></div>","PeriodicalId":72254,"journal":{"name":"Applied Corpus Linguistics","volume":"4 3","pages":"Article 100116"},"PeriodicalIF":0.0,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142659057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}