Authors: Mark McGlashan, Charlotte-Rose Kennedy
Journal: Applied Corpus Linguistics 5(3), Article 100149
Published: 2025-08-26
Publication type: Journal Article
DOI: 10.1016/j.acorp.2025.100149
URL: https://www.sciencedirect.com/science/article/pii/S2666799125000322
Citations: 0
Abstract
Corpus linguistics for safeguarding children online
Safeguarding children in schools broadly refers to the actions taken to protect children from abuse, prevent damage to health and development, and promote conditions that would improve the life chances of children. To safeguard children, UK schools must implement filtering and monitoring software to “block harmful and inappropriate content without unreasonably impacting teaching and learning” (Department for Education, 2024: 40). The industry standard method for monitoring online language use in schools is ‘keyword monitoring’, which identifies the use or presence of specific words or phrases (e.g. ‘bomb’) that correlate with a specific form of risk (e.g. violence). However, this approach typically depends on lists of words isolated from their context(s) of use and tends only to raise concerns if there is a direct match to a ‘keyword’. This can lead to ‘false positives’ whereby a ‘keyword’ match raises an automatic safeguarding concern (e.g. ‘bomb’) even if the use of the keyword was innocuous (e.g. ‘bath bomb’). This paper introduces corpus linguistics as a set of methods and approaches to enhance the effectiveness of filtering and monitoring through a case study based on a 1,094,914-word corpus of online testimonies relating to suicide. In doing so, we demonstrate how corpus methods and analysis of authentic language data can be used to identify and contextualise safeguarding concerns. The practical applications of this research are intended to help schools to better protect children from the illegal and legal (but harmful) online materials that currently pose a threat to their safety and wellbeing.
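The contrast the abstract draws between context-free keyword matching and context-aware analysis can be illustrated with a minimal Python sketch. It is not the authors' method or any vendor's implementation: the watch list, the benign collocate table and all function names (`keyword_flags`, `concordance`, `contextual_flags`) are hypothetical, chosen only to reproduce the ‘bomb’ versus ‘bath bomb’ example.

```python
import re

# Hypothetical watch list and benign collocates; illustrative only, not from the paper.
KEYWORDS = {"bomb"}
BENIGN_LEFT_COLLOCATES = {"bomb": {"bath"}}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_flags(text):
    """Naive keyword monitoring: return every token that matches the watch list."""
    return [tok for tok in tokenize(text) if tok in KEYWORDS]


def concordance(text, keyword, window=3):
    """Return keyword-in-context (KWIC) tuples with `window` tokens on each side."""
    tokens = tokenize(text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


def contextual_flags(text):
    """Flag a keyword unless the word immediately before it is a known benign collocate."""
    tokens = tokenize(text)
    flags = []
    for i, tok in enumerate(tokens):
        if tok in KEYWORDS:
            previous = tokens[i - 1] if i > 0 else None
            if previous not in BENIGN_LEFT_COLLOCATES.get(tok, set()):
                flags.append(tok)
    return flags


if __name__ == "__main__":
    message = "I bought a bath bomb today"
    print(keyword_flags(message))      # naive match on the keyword alone
    print(concordance(message, "bomb"))  # the surrounding context an analyst would inspect
    print(contextual_flags(message))   # context-aware check
```

In this sketch the benign collocate table is written by hand. A corpus-based approach would presumably derive such context patterns from authentic language data rather than from intuition, which is the role the abstract assigns to corpus methods.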