Better Prevent than React: Deep Stratified Learning to Predict Hate Intensity of Twitter Reply Chains

Dhruv Sahnan, Snehil Dahiya, Vasu Goel, Anil Bandhakavi, Tanmoy Chakraborty
{"title":"Better Prevent than React: Deep Stratified Learning to Predict Hate Intensity of Twitter Reply Chains","authors":"Dhruv Sahnan, Snehil Dahiya, Vasu Goel, Anil Bandhakavi, Tanmoy Chakraborty","doi":"10.1109/ICDM51629.2021.00066","DOIUrl":null,"url":null,"abstract":"Given a tweet, predicting the discussions that unfold around it is convoluted, to say the least. Most if not all of the discernibly benign tweets which seem innocuous may very well attract inflammatory posts (hate speech) from people who find them non-congenial. Therefore, building upon the aforementioned task and predicting if a tweet will incite hate speech is of critical importance. To stifle the dissemination of online hate speech is the need of the hour. Thus, there have been a handful of models for the detection of hate speech. Classical models work retrospectively by leveraging a reactive strategy – detection after the postage of hate speech, i.e., a backward trace after detection. Therefore, a benign post that may act as a surrogate to invoke toxicity in the near future, may not be flagged by the existing hate speech detection models. In this paper, we address this problem through a proactive strategy initiated to avert hate crime. We propose DRAGNET, a deep stratified learning framework which predicts the intensity of hatred that a root tweet can fetch through its subsequent replies. We extend the collection of social media discourse from our earlier work [1], comprising the entire reply chains up to $\\sim$5k root tweets catalogued into four controversial topics Similar to [1], we notice a handful of cases where despite the root tweets being non-hateful, the succeeding replies inject an enormous amount of toxicity into the discussions. DRAGNET turns out to be highly effective, significantly outperforming six state-of-the-art baselines. It beats the best baseline with an increase of 9.4% in the Pearson correlation coefficient and a decrease of 19% in Root Mean Square Error. Further, DRAGNET’S deployment in Logically’s advanced AI platform designed to monitor real-world problematic and hateful narratives has improved the aggregated insights extracted for understanding their spread, influence and thereby offering actionable intelligence to counter them","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"85 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Data Mining (ICDM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM51629.2021.00066","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Given a tweet, predicting the discussions that unfold around it is a convoluted task. Many discernibly benign tweets that seem innocuous may well attract inflammatory posts (hate speech) from people who find them disagreeable. Predicting whether a tweet will incite hate speech is therefore of critical importance: stifling the dissemination of online hate speech is the need of the hour. A handful of models exist for detecting hate speech, but classical models work retrospectively, leveraging a reactive strategy: detection only after the hateful content has been posted, i.e., a backward trace after the fact. Consequently, a benign post that may act as a surrogate to invoke toxicity in the near future may not be flagged by existing hate speech detection models. In this paper, we address this problem through a proactive strategy aimed at averting hate crime. We propose DRAGNET, a deep stratified learning framework that predicts the intensity of hatred a root tweet can attract through its subsequent replies. We extend the collection of social media discourse from our earlier work [1], comprising the entire reply chains of ~5k root tweets catalogued into four controversial topics. Similar to [1], we notice a handful of cases where, despite the root tweets being non-hateful, the succeeding replies inject an enormous amount of toxicity into the discussions. DRAGNET turns out to be highly effective, significantly outperforming six state-of-the-art baselines: it beats the best baseline with a 9.4% increase in the Pearson correlation coefficient and a 19% decrease in Root Mean Square Error. Further, DRAGNET's deployment in Logically's advanced AI platform, designed to monitor real-world problematic and hateful narratives, has improved the aggregated insights extracted for understanding their spread and influence, thereby offering actionable intelligence to counter them.
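The abstract reports two evaluation metrics for this regression task: the Pearson correlation coefficient (higher is better) and the Root Mean Square Error (lower is better). Below is a minimal Python sketch of how predicted hate-intensity scores for reply chains could be compared against ground truth using these two metrics. It is not the authors' implementation; the evaluate function and the example score arrays are hypothetical placeholders for illustration only.

    # Minimal sketch (not the paper's code): scoring a hate-intensity
    # regressor with the two metrics the abstract reports.
    import numpy as np
    from scipy.stats import pearsonr

    def evaluate(predicted: np.ndarray, actual: np.ndarray) -> dict:
        """Compare predicted hate-intensity scores against ground truth."""
        r, _ = pearsonr(predicted, actual)  # linear correlation; higher is better
        rmse = np.sqrt(np.mean((predicted - actual) ** 2))  # lower is better
        return {"pearson_r": r, "rmse": rmse}

    # Hypothetical intensities for five root tweets' reply chains.
    predicted = np.array([0.10, 0.45, 0.80, 0.30, 0.65])
    actual = np.array([0.12, 0.50, 0.70, 0.25, 0.60])
    print(evaluate(predicted, actual))

Under this framing, the paper's claimed gains over the best baseline would correspond to a 9.4% higher pearson_r and a 19% lower rmse on the held-out reply-chain data.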