Detecting Hate Speech in Hindi in Online Social Media

2023 3rd International Conference on Intelligent Communication and Computational Techniques (ICCT). Pub Date: 2023-01-19. DOI: 10.1109/ICCT56969.2023.10075749

Anushka Sharma, Rishabh Kaushal



Abstract

The rise of online hate has led the artificial intelligence research community, particularly in natural language processing (NLP), to develop models for detecting it. Recently, code-mixing, the use of multiple languages within social media conversations, has made multilingual hate speech a significant challenge for automated detection. Identifying text that incites hatred on social networking sites is a crucial NLP task, with applications in sentiment analysis, cyberbullying detection, and the study of social and political conflict. In this paper, we study hate speech detection in a multilingual setting using tweets posted on Twitter. Each tweet is annotated with the speech category to which it belongs (normal speech or hate speech). We therefore propose a supervised method for detecting hate speech. The classification approach uses character-level, word-level, and lexicon-based features to identify hate speech in the corpus. We obtain 96% accuracy in identifying posts across four classifiers.

Index Terms: Hate speech, Multilingual, Code-mixing, NLP
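The three feature families named in the abstract (character-level, word-level, and lexicon-based) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the n-gram sizes, the lexicon, or the four classifiers, so the function names, the default `char_n=3` and `word_n=1`, and the placeholder lexicon below are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical placeholder lexicon. The abstract does not give the lexicon
# the authors used; a real one would list known abusive terms.
PLACEHOLDER_LEXICON = {"badword1", "badword2"}


def char_ngrams(text, n=3):
    """Count overlapping character n-grams (n is an assumed value)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(text, n=1):
    """Count word n-grams after whitespace tokenization (n is assumed)."""
    tokens = text.lower().split()
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def lexicon_hits(text, lexicon=PLACEHOLDER_LEXICON):
    """Count tokens that appear in the lexicon."""
    return sum(1 for token in text.lower().split() if token in lexicon)


def extract_features(text, char_n=3, word_n=1, lexicon=PLACEHOLDER_LEXICON):
    """Merge the three feature families into one sparse feature dict."""
    features = {}
    for gram, count in char_ngrams(text, char_n).items():
        features["char:" + gram] = count
    for gram, count in word_ngrams(text, word_n).items():
        features["word:" + gram] = count
    features["lexicon_hits"] = lexicon_hits(text, lexicon)
    return features
```

A feature dict of this shape could then be vectorized and passed to any supervised classifier trained on the labeled tweets. Character n-grams are a common choice for code-mixed text because they tolerate the inconsistent romanized spellings that defeat word-level matching.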

