The 7th Workshop on Online Abuse and Harms (WOAH): Latest Publications

Factoring Hate Speech: A New Annotation Framework to Study Hate Speech in Social Media
The 7th Workshop on Online Abuse and Harms (WOAH) · Pub Date: 2023-11-07 · DOI: 10.18653/v1/2023.woah-1.21
Gal Ron, Effi Levi, Odelia Oshri, Shaul R. Shenhav
Abstract: In this work we propose a novel annotation scheme which factors hate speech into five separate discursive categories. To evaluate our scheme, we construct a corpus of over 2.9M Twitter posts containing hateful expressions directed at Jews, and annotate a sample dataset of 1,050 tweets. We present a statistical analysis of the annotated dataset, discuss annotation examples, and conclude with promising directions for future work.
Citations: 0
Benchmarking Offensive and Abusive Language in Dutch Tweets
DOI: 10.18653/v1/2023.woah-1.7
Tommaso Caselli, H. van der Veen
Abstract: We present an extensive evaluation of different fine-tuned models for detecting offensive and abusive language in Dutch across three benchmarks: a standard held-out test set, a task-agnostic functional benchmark, and a dynamic test set. We also investigate the use of data cartography to identify high-quality training data. Our results show that the manually annotated data used to train the models is of relatively good quality, while highlighting some critical weaknesses. We also find good portability of trained models across the same language phenomena. As for data cartography, we find a positive impact only on the functional benchmark, and only when selecting data per annotated dimension rather than using the entire training material.
Citations: 0
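The data cartography mentioned in the abstract above characterizes each training example by the model's behaviour on it across training epochs. A minimal sketch of the two core per-example statistics, confidence and variability, following the dataset cartography formulation (the example probability trajectories below are hypothetical, not taken from the paper):

```python
from statistics import mean, stdev

def cartography_stats(gold_probs_per_epoch):
    """For one training example, given the model's probability of the gold
    label at the end of each epoch, return (confidence, variability):
    confidence = mean gold-label probability, variability = its std dev."""
    confidence = mean(gold_probs_per_epoch)
    variability = stdev(gold_probs_per_epoch) if len(gold_probs_per_epoch) > 1 else 0.0
    return confidence, variability

# Hypothetical "easy-to-learn" example: consistently high gold-label probability.
easy = cartography_stats([0.9, 0.92, 0.95, 0.97])
# Hypothetical "ambiguous" example: gold-label probability swings across epochs.
ambiguous = cartography_stats([0.2, 0.8, 0.4, 0.7])
```

Selecting high-quality training data then amounts to filtering examples by region of this (confidence, variability) map, e.g. keeping high-confidence or high-variability examples per annotated dimension.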
HOMO-MEX: A Mexican Spanish Annotated Corpus for LGBT+phobia Detection on Twitter
DOI: 10.18653/v1/2023.woah-1.20
Juan Vásquez, S. Andersen, G. Bel-Enguix, Helena Gómez-Adorno, Sergio-Luis Ojeda-Trueba
Abstract: In the past few years, the NLP community has actively worked on detecting LGBT+phobia in online spaces using publicly available textual data. Most of these efforts target English and its variants, since it is the most studied language in the NLP community; nevertheless, efforts toward creating corpora in other languages are active worldwide. Despite this, Spanish remains understudied with respect to digital LGBT+phobia: the only corpus we found in the literature covers Peninsular Spanish dialects, which use LGBT+phobic terms different from those in the Mexican dialect. For this reason, we present HOMO-MEX, a novel corpus for detecting LGBT+phobia in Mexican Spanish. In this paper, we describe our data-gathering and annotation process, and present a classification benchmark using various traditional machine learning algorithms and two pre-trained deep learning models to showcase the corpus's classification potential.
Citations: 0
Respectful or Toxic? Using Zero-Shot Learning with Language Models to Detect Hate Speech
DOI: 10.18653/v1/2023.woah-1.6
F. Plaza-Del-Arco, Debora Nozza, Dirk Hovy
Abstract: Hate speech detection faces two significant challenges: 1) the limited availability of labeled data, and 2) the high variability of hate speech across different contexts and languages. Prompting brings a ray of hope to these challenges: it allows injecting a model with task-specific knowledge without relying on labeled data. This paper explores zero-shot learning with prompting for hate speech detection. We investigate how well zero-shot learning can detect hate speech in three languages with limited labeled data, experimenting with various large language models and verbalizers on eight benchmark datasets. Our findings highlight the impact of prompt selection on the results, and suggest that prompting, specifically with recent large language models, can achieve performance comparable to, and even surpassing, fine-tuned models, making it a promising alternative for under-resourced languages. Both the prompt and the model have a significant impact on achieving accurate predictions in this task.
Citations: 4
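A verbalizer, as used in the prompting setup above, maps label words in the model's output vocabulary onto task labels. A minimal sketch of the scoring step, with a hypothetical prompt and a made-up next-token distribution standing in for real language-model output (not the paper's actual prompts or models):

```python
def verbalize(token_probs, verbalizer):
    """Aggregate next-token probabilities over each label's verbalizer
    words and return the highest-scoring task label."""
    scores = {label: sum(token_probs.get(w, 0.0) for w in words)
              for label, words in verbalizer.items()}
    return max(scores, key=scores.get)

# Hypothetical verbalizer for a prompt such as:
# "Is the following text hateful? Text: '<tweet>' Answer:"
verbalizer = {
    "hateful": ["yes", "hateful", "toxic"],
    "not-hateful": ["no", "respectful", "fine"],
}

# Made-up next-token distribution (illustrative only, not real LM output).
token_probs = {"yes": 0.31, "no": 0.22, "toxic": 0.12, "respectful": 0.05}
label = verbalize(token_probs, verbalizer)  # -> "hateful" (0.43 vs. 0.27)
```

The abstract's finding that results hinge on prompt selection shows up here directly: both the prompt wording and the choice of verbalizer words change which probability mass gets counted for each label.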
Towards Safer Communities: Detecting Aggression and Offensive Language in Code-Mixed Tweets to Combat Cyberbullying
DOI: 10.18653/v1/2023.woah-1.3
Nazia Nafis, Diptesh Kanojia, Naveen Saini, Rudra Murthy
Abstract: Cyberbullying is a serious societal issue widespread across various channels and platforms, particularly social networking sites, which have proven exceptionally fertile ground for such behavior. The dearth of high-quality training data for multilingual and low-resource scenarios, data that can accurately capture the nuances of social media conversations, often poses a roadblock to this task. This paper attempts to tackle cyberbullying in its two most common manifestations: aggression and offensiveness. We present a novel dataset of 10,000 English and Hindi-English code-mixed tweets, manually annotated for aggression detection and offensive language detection. Our annotations are supported by inter-annotator agreement scores of 0.67 and 0.74 for the two tasks, indicating substantial agreement. We perform comprehensive fine-tuning of pre-trained language models (PTLMs) on this dataset to check its efficacy; on our challenging test sets, the best models achieve macro F1-scores of 67.87 and 65.45 on the two tasks, respectively. Further, we perform cross-dataset transfer learning to benchmark our dataset against existing aggression and offensive language datasets, and present a detailed quantitative and qualitative analysis of prediction errors. With this paper, we publicly release the novel dataset, code, and models.
Citations: 0
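The inter-annotator agreement scores of 0.67 and 0.74 quoted above are chance-corrected agreement; for two annotators, Cohen's kappa is the standard coefficient (the abstract does not name the exact statistic, so this is an assumption). A self-contained sketch on toy annotations:

```python
from collections import Counter

def cohens_kappa(ann_a, ann_b):
    """Cohen's kappa for two annotators labelling the same items:
    observed agreement p_o corrected for chance agreement p_e."""
    assert len(ann_a) == len(ann_b) and ann_a
    n = len(ann_a)
    p_o = sum(a == b for a, b in zip(ann_a, ann_b)) / n
    freq_a, freq_b = Counter(ann_a), Counter(ann_b)
    # Chance agreement: product of each annotator's marginal label rates.
    p_e = sum((freq_a[l] / n) * (freq_b[l] / n)
              for l in set(freq_a) | set(freq_b))
    return (p_o - p_e) / (1 - p_e)

# Toy binary annotations (aggressive = 1), purely illustrative.
ann_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
ann_b = [1, 0, 0, 1, 0, 1, 1, 1, 1, 1]
kappa = cohens_kappa(ann_a, ann_b)  # p_o = 0.8, p_e = 0.58, kappa ≈ 0.524
```

On the usual interpretation scale, values in the 0.61-0.80 band, like the paper's 0.67 and 0.74, are read as "substantial agreement", matching the abstract's wording.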
Resources for Automated Identification of Online Gender-Based Violence: A Systematic Review
DOI: 10.18653/v1/2023.woah-1.17
Gavin Abercrombie, Aiqi Jiang, Poppy Gerrard-abbott, Ioannis Konstas, Verena Rieser
Abstract: Online Gender-Based Violence (GBV), such as misogynistic abuse, is an increasingly prevalent problem that technological approaches have struggled to address. Through the lens of the GBV framework, which is rooted in social science and policy, we systematically review 63 available resources for automated identification of such language. We find the datasets are limited in a number of important ways, such as their lack of theoretical grounding and stakeholder input, their static nature, and their focus on certain media platforms. Based on this review, we recommend that future resources be rooted in sociological expertise and center stakeholder voices, namely GBV experts and people with lived experience of GBV.
Citations: 1
Aporophobia: An Overlooked Type of Toxic Language Targeting the Poor
DOI: 10.18653/v1/2023.woah-1.12
S. Kiritchenko, Georgina Curto Rex, I. Nejadgholi, Kathleen C. Fraser
Abstract: While many types of hate speech and online toxicity have been the focus of extensive research in NLP, toxic language stigmatizing poor people has been mostly disregarded. Yet aporophobia, a social bias against the poor, is a common phenomenon online, one that can be psychologically damaging and can hinder poverty-reduction policy measures. We demonstrate that aporophobic attitudes are indeed present in social media, and argue that existing NLP datasets and models are inadequate to effectively address this problem. Efforts toward designing specialized resources and novel socio-technical mechanisms for confronting aporophobia are needed.
Citations: 0
Relationality and Offensive Speech: A Research Agenda
DOI: 10.18653/v1/2023.woah-1.8
Razvan Amironesei, Mark Díaz
Abstract: We draw from the framework of relationality as a pathway for modeling social relations to address gaps in text classification generally, and offensive language classification specifically. We use minoritized language, such as queer speech, to motivate a need for understanding and modeling social relations, both among individuals and among their social communities. We then point to socio-ethical style as a research area for inferring and measuring social relations, and propose additional questions to structure future research on operationalizing social context.
Citations: 0
“Female Astronaut: Because sandwiches won’t make themselves up there”: Towards Multimodal Misogyny Detection in Memes
DOI: 10.18653/v1/2023.woah-1.15
Smriti Singh, Amritha Haridasan, R. Mooney
Abstract: A rise in the circulation of memes has led to the spread of a new form of multimodal hateful content, and the degree of hate women receive on the internet is disproportionately skewed against them. This, combined with the fact that multimodal misogyny is more challenging to detect than traditional text-based misogyny, makes identifying misogynistic memes online a task of utmost importance. To this end, the MAMI dataset was released, consisting of 12,000 memes annotated for misogyny and four sub-classes of misogyny: shame, objectification, violence, and stereotype. While this balanced dataset is widely cited, we find that the task itself remains largely unsolved. In our work, we investigate the performance of multiple models to analyze whether domain-specific pretraining helps, and why even state-of-the-art models find this task so challenging. Our results show that pretraining BERT on hateful memes and leveraging an attention-based approach with ViT outperforms state-of-the-art models by more than 10%. Further, we provide insight into why these models may be struggling with this task through an extensive qualitative analysis of random samples from the test set.
Citations: 0
Harmful Language Datasets: An Assessment of Robustness
DOI: 10.18653/v1/2023.woah-1.24
Katerina Korre, John Pavlopoulos, Jeffrey Sorensen, Leo Laugier, I. Androutsopoulos, Lucas Dixon, Alberto Barrón-Cedeño
Abstract: The automated detection of harmful language has been of great importance for the online world, especially with the growing importance of social media and, consequently, polarisation. There are many open challenges to high-quality detection of harmful text, from dataset creation to generalisable application, calling for more systematic studies. In this paper, we explore re-annotation as a means of examining the robustness of already existing labelled datasets, showing that, despite using alternative definitions, the inter-annotator agreement remains very inconsistent, highlighting the intrinsically subjective and variable nature of the task. In addition, we build automatic toxicity detectors using the existing datasets with their original labels, and evaluate them on our multi-definition and multi-source datasets. Surprisingly, while other studies show that hate speech detection models perform better on data drawn from the same distribution as the training set, our analysis demonstrates this is not necessarily true.
Citations: 0