An integrated explicit and implicit offensive language taxonomy

Q2 Arts and Humanities
Barbara Lewandowska-Tomaszczyk, A. Bączkowska, Chaya Liebeskind, Giedre Valunaite Oleskeviciene, Slavko Žitnik
Lodz Papers in Pragmatics, vol. 19, no. 1, pp. 7-48. Published 2023-05-01. DOI: 10.1515/lpp-2023-0002 (https://doi.org/10.1515/lpp-2023-0002)
Citations: 0

Abstract

The current study presents an integrated model of explicit and implicit offensive language taxonomy. First, it focuses on a definitional revision and enrichment of the explicit offensive language taxonomy by reviewing the available corpora and comparing the tagging schemas applied in them. The study relies mainly on the offensive language categorization schemata originally proposed by Zampieri et al. (2019). After an explanation of the semantic differences between particular concepts used in the tagging systems and an analysis of theoretical frameworks, a finite set of classes is presented, which covers aspects of offensive language representation along with linguistically sound explanations (Lewandowska-Tomaszczyk et al. 2021). In the analytic procedure, offensive discourse is first distinguished from non-offensive discourse, followed by the question of the offence Target and the subsequent categorization levels and sublevels. Based on relevant data generated from Sketch Engine (https://www.sketchengine.eu/ententen-english-corpus/), we propose the concept of offensive language as a superordinate category in our system, with 17 hierarchically arranged subcategories. The categories are taxonomically structured into 4 levels and verified with the use of neural-based (lexical) embeddings. Together with a taxonomy of implicit offensive language and its subcategorization levels, which has received little scholarly attention until now, the categorization is exemplified in samples of offensive discourse from selected English social media materials, i.e., 25 publicly available web-based hate speech datasets (consult Appendix 1 for a complete list). The study discusses the offensive category levels (types of offence, targets, etc.) and aspects (offensive language property clusters), as well as the categories of explicitness and implicitness, and proposes a computationally verified integrated explicit and implicit offensive language taxonomy.
Source journal: Lodz Papers in Pragmatics (Arts and Humanities: Language and Linguistics)
CiteScore: 1.10
Self-citation rate: 0.00%
Articles published: 0