Using Supervised Learning Models for Creating a New Fake News Analysis and Classification of a COVID-19 Dataset: A case study on Covid-19 in Iran

Mohammadreza Parvizimosaed, M. Esnaashari, A. Damia, Razieh Bahmanyar
Published in: 2022 8th International Conference on Web Research (ICWR), 2022-05-11
DOI: 10.1109/ICWR54782.2022.9786244
Citations: 2

Abstract

The emergence of the coronavirus as a pandemic and its global spread are a significant concern for our society and the international community. In recent years, many individuals have shifted their main source of news and information to social networks. Consequently, the widespread dissemination of false and misleading information on social media has become a significant concern for most politicians. Our effort is directed not only against COVID-19 but against an “infodemic” as well. To address this, we have collected and released a labeled dataset of 7,000 Persian-language social media posts and articles containing real and fake COVID-19 news. Fake news about COVID-19 has previously been detected in other languages such as Arabic, English, Chinese, and Hindi. We perform a binary classification task (real vs. fake) on the labeled dataset and compare six machine learning baselines: Logistic Regression, Support Vector Machine, Decision Tree, Naive Bayes, K-Nearest Neighbors, and Random Forest. On the test set, the Support Vector Machine gives the best results, with an accuracy of 89 percent.
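To make the evaluation setup concrete, the following is a minimal sketch of one of the six baselines named above (Naive Bayes) trained on labeled posts and scored by accuracy on a held-out test set. It is not the authors' code: the tokenizer, the class and function names, and the toy English posts are all assumptions invented for illustration, and the Persian dataset itself is not reproduced here.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (assumption; Persian text would need proper normalization)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # number of documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict_one(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(self.class_counts[label] / total_docs)
            for token in tokenize(text):
                if token in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def predict(self, texts):
        return [self.predict_one(t) for t in texts]


def accuracy(y_true, y_pred):
    """Fraction of test examples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical toy posts, invented for illustration; not taken from the paper's dataset.
train_texts = [
    "garlic cures the virus completely",
    "drinking hot water kills the virus",
    "miracle herb cures infection overnight",
    "health ministry reports new confirmed cases",
    "ministry announces vaccination schedule",
    "hospital reports recovered patients today",
]
train_labels = ["fake", "fake", "fake", "real", "real", "real"]
test_texts = ["herb cures the virus", "ministry reports new cases"]
test_labels = ["fake", "real"]

model = NaiveBayes().fit(train_texts, train_labels)
predictions = model.predict(test_texts)
test_accuracy = accuracy(test_labels, predictions)
print(predictions, test_accuracy)
```

The same fit, predict, and accuracy loop would apply to each of the other five baselines, which is what makes their test-set accuracies directly comparable. Accuracy on a toy example like this says nothing about the 89 percent figure reported for the Support Vector Machine.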