A Novel System for Regional Twitter Hate Speech Analysis and Detection using Deep Learning Models and Web Scraping

Machine Learning and Soft Computing | Pub Date: 2023-01-28 | DOI: 10.5121/csit.2023.130207

Nicole Ma, Yu Sun


Citations: 0

Abstract

A Novel System for Regional Twitter Hate Speech Analysis and Detection using Deep Learning Models and Web Scraping

Instances of hate speech on popular social media platforms such as Twitter are becoming increasingly common and intense. However, comprehensive deep learning models to combat Twitter hate speech are still lacking. In this project, a comprehensive detection and reporting platform, entitled “TweetWatch,” was created to address this issue. A binary classification CNN (Convolutional Neural Network) and a multi-class CNN were created to detect hate speech in real-time Twitter data and to classify tweets containing hate speech into five categories. The binary classification model has an AUC score of 98.95% and an F1 score of 97.88%. The multi-class classification model has an AUC score of 89.46%. All metrics exceeded the targeted 5% improvement over models from several previous papers, validating the proposed solution. Additionally, the only real-time choropleth map for hate speech in the United States was successfully created.
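
The abstract reports F1 and AUC scores for the binary classifier. As a rough illustration of what these two metrics measure, the following minimal sketch computes them in plain Python. It is not the authors' evaluation code: the function names `f1_score` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for this example.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2 * precision * recall / (precision + recall) for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented for illustration only (1 = hate speech, 0 = not).
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
# Hypothetical decision threshold of 0.5 turns scores into hard predictions.
preds = [1 if s >= 0.5 else 0 for s in scores]

toy_f1 = f1_score(labels, preds)
toy_auc = roc_auc(labels, scores)
print(f"F1 = {toy_f1:.4f}, AUC = {toy_auc:.4f}")
```

F1 depends on the chosen threshold, while AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason the two are commonly reported together.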

