Demystifying Black-box Learning Models of Rumor Detection from Social Media Posts

2021 IEEE 12th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON) | Pub Date: 2021-12-01 | DOI: 10.1109/uemcon53757.2021.9666567

Faiza Tafannum, Mir Nafis Sharear Shopnil, Anika Salsabil, Navid Ahmed, Md. Golam Rabiul Alam, Md. Tanzim Reza


Citations: 1

Abstract

Social media and its users are vulnerable to the spread of rumors, so protecting users from rumors is extremely important. To this end, we propose a novel approach for rumor detection in social media that consists of multiple robust models: XGBoost Classifier, Support Vector Machine, Random Forest Classifier, Extra Tree Classifier, Decision Tree Classifier, a hybrid model, and the deep learning models LSTM and BERT. Two datasets are used for evaluation. Such artificial intelligence algorithms are often referred to as "black boxes": data go into the box and predictions come out, but what happens inside frequently remains unclear. Although there have been several works on detecting fake news, work on rumor detection is still limited, and the models used in existing works do not explain their decision-making process. We take the models with higher accuracy and illustrate which features of the data contribute the most to a post being predicted as a rumor or a non-rumor, thereby explaining the opaque process inside the black-box models. On the COVID19 Fake News dataset and the concatenation of the Twitter15 and Twitter16 datasets, respectively, our hybrid model achieves accuracies of 93.22% and 82.49%, LSTM achieves 99.81% and 98.41%, and BERT achieves 99.62% and 94.80%.
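The abstract does not name the explanation technique the authors used, so the following is only a minimal sketch of the general idea of attributing a rumor/non-rumor prediction to individual features. It swaps in a deliberately simple, transparent model (naive-Bayes-style per-token log-odds with Laplace smoothing) rather than any of the models listed above. The toy posts, the function names `fit_log_odds` and `explain`, and the smoothing parameter `alpha` are all assumptions made for illustration and do not come from the paper or its datasets.

```python
import math
from collections import Counter

# Toy, made-up posts for illustration only (NOT from the paper's datasets).
# Label 1 = rumor, label 0 = non-rumor.
TRAIN = [
    ("breaking miracle cure found share now", 1),
    ("shocking secret cure they hide from you", 1),
    ("share now before this gets deleted", 1),
    ("health ministry publishes vaccination schedule", 0),
    ("official report confirms new case numbers", 0),
    ("ministry confirms schedule for clinics", 0),
]


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def fit_log_odds(samples, alpha=1.0):
    """Return per-token log-odds weights (rumor vs. non-rumor) and the prior log-odds."""
    counts = {0: Counter(), 1: Counter()}
    n_docs = Counter()
    for text, label in samples:
        counts[label].update(tokenize(text))
        n_docs[label] += 1
    vocab = set(counts[0]) | set(counts[1])
    # Laplace-smoothed token totals per class.
    totals = {c: sum(counts[c].values()) + alpha * len(vocab) for c in (0, 1)}
    weights = {
        tok: math.log((counts[1][tok] + alpha) / totals[1])
        - math.log((counts[0][tok] + alpha) / totals[0])
        for tok in vocab
    }
    bias = math.log(n_docs[1] / n_docs[0])
    return weights, bias


def explain(text, weights, bias):
    """Score a post and rank each known token by its additive contribution."""
    contributions = [(tok, weights[tok]) for tok in tokenize(text) if tok in weights]
    score = bias + sum(w for _, w in contributions)
    label = "rumor" if score > 0 else "non-rumor"
    ranked = sorted(contributions, key=lambda kv: abs(kv[1]), reverse=True)
    return label, score, ranked


weights, bias = fit_log_odds(TRAIN)
label, score, ranked = explain("miracle cure share now", weights, bias)
print(label, round(score, 3))
for tok, w in ranked:
    # Positive weights push toward "rumor", negative toward "non-rumor".
    print(f"{tok:>10s} {w:+.3f}")
```

Because the score is a plain sum of token weights, each token's contribution can be read off directly. Models such as BERT or LSTM have no such additive form, which is why they typically need a separate post-hoc explanation method.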

