“Somewhere along your pedigree, a bitch got over the wall!” A proposal of implicitly offensive language typology

Q2 Arts and Humanities
Kristina Š. Despot, A. Anić, Tony Veale
Journal: Lodz Papers in Pragmatics, vol. 27, pp. 385-414 (Journal Article)
Published: 2023-12-01
DOI: 10.1515/lpp-2023-0019 (https://doi.org/10.1515/lpp-2023-0019)

Abstract

The automatic detection of implicitly offensive language is a challenge for NLP because such language is subtle, contextual, and plausibly deniable, yet it is becoming increasingly important with the wider use of large language models to generate human-quality texts. This study argues that current difficulties in detecting implicit offence are exacerbated by three factors: (a) inadequate definitions of implicit and explicit offence; (b) an insufficient typology of implicit offence; and (c) a dearth of detailed analysis of implicitly offensive linguistic data. Based on a qualitative analysis of an implicitly offensive dataset, the study proposes a new typology of implicitly offensive language, along with a detailed, example-led account of that typology, an operational definition of implicitly offensive language, and a thorough analysis of the role of figurative language and humour in each type. Our analyses identify three main issues with previous datasets and typologies used in NLP approaches: (a) conflating content and form in annotation; (b) treating figurativeness, particularly metaphor, as the main device of implicitness while ignoring its equally important role in explicit offence; and (c) an over-focus on form-specific datasets (e.g. those focusing only on offensive comparisons), which fails to reflect the full complexity of offensive language use.
Source journal: Lodz Papers in Pragmatics (Arts and Humanities: Language and Linguistics). CiteScore: 1.10. Self-citation rate: 0.00%.