Developed Models Based on Transfer Learning for Improving Fake News Predictions

Tahseen A. Wotaifi, B. N. Dhannoon
J. Univers. Comput. Sci., 25 1, pp. 491-507. Published 2023-05-28. DOI: 10.3897/jucs.94081 (https://doi.org/10.3897/jucs.94081)

Abstract

Global concern about the spread of fake news on social media has prompted a large body of research on the problem. The rapid growth of social media and online forums has made it easy for legitimate news to become mixed with misleading news, distorting people's perceptions. This study therefore uses deep learning, pre-trained models, and machine learning to predict Arabic and English fake news on three publicly available datasets: the Fake-or-Real dataset, the AraNews dataset, and the Sentimental LIAR dataset. A hybrid network based on the GloVe (Global Vectors) and FastText pre-trained models is proposed to improve fake news prediction. In this network, a CNN (Convolutional Neural Network) identifies the most important features, while a BiGRU (Bidirectional Gated Recurrent Unit) captures the long-term dependencies of sequences. Finally, a multi-layer perceptron (MLP) classifies each news article as fake or real. In addition, an Improved Random Forest Model is built on the embedding values extracted from the BERT (Bidirectional Encoder Representations from Transformers) pre-trained model together with the relevant speaker-based features. These relevant features are identified by a fuzzy model based on feature selection methods. Accuracy was used to measure the quality of the proposed models: prediction accuracy reached 0.9935, 0.9473, and 0.7481 on the Fake-or-Real, AraNews, and Sentimental LIAR datasets, respectively. The proposed models showed a significant improvement in the accuracy of predicting Arabic and English fake news compared with previous studies that used the same datasets.
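
The embedding, CNN, BiGRU, and MLP stages of the hybrid network can be pictured with a minimal forward-pass sketch. This is an illustration under stated assumptions, not the authors' implementation: the abstract gives no layer sizes, kernel widths, pooling, or training procedure, so every dimension below is hypothetical, the weights are random and untrained, and a random embedding table stands in for the GloVe or FastText vectors. The function names (`init_model`, `predict_fake_probability`, and the helpers) are invented for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1d_relu(x, kernels, bias):
    """Valid 1-D convolution over time, then ReLU.
    x: (T, D), kernels: (K, D, F), bias: (F,) -> output (T - K + 1, F)."""
    k = kernels.shape[0]
    steps = x.shape[0] - k + 1
    out = np.stack([
        np.tensordot(x[t:t + k], kernels, axes=([0, 1], [0, 1]))
        for t in range(steps)
    ])
    return np.maximum(out + bias, 0.0)

def init_gru(rng, in_dim, hidden):
    """Random parameters for the update (z), reset (r), and candidate (h) gates."""
    p = {}
    for g in "zrh":
        p["W" + g] = rng.normal(0.0, 0.1, (hidden, in_dim))
        p["U" + g] = rng.normal(0.0, 0.1, (hidden, hidden))
        p["b" + g] = np.zeros(hidden)
    return p

def gru(seq, p, hidden):
    """Run a GRU over seq (T, F) and return the final hidden state (hidden,)."""
    h = np.zeros(hidden)
    for x_t in seq:
        z = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h + p["bz"])
        r = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h + p["br"])
        h_tilde = np.tanh(p["Wh"] @ x_t + p["Uh"] @ (r * h) + p["bh"])
        h = (1.0 - z) * h + z * h_tilde
    return h

def init_model(seed=0, vocab=50, emb_dim=8, kernel=3, filters=6, hidden=4, mlp_dim=5):
    """All sizes are hypothetical; the abstract does not specify them."""
    rng = np.random.default_rng(seed)
    return {
        "emb": rng.normal(0.0, 0.1, (vocab, emb_dim)),  # stand-in for GloVe/FastText
        "conv_w": rng.normal(0.0, 0.1, (kernel, emb_dim, filters)),
        "conv_b": np.zeros(filters),
        "fwd": init_gru(rng, filters, hidden),
        "bwd": init_gru(rng, filters, hidden),
        "W1": rng.normal(0.0, 0.1, (mlp_dim, 2 * hidden)),
        "b1": np.zeros(mlp_dim),
        "w2": rng.normal(0.0, 0.1, mlp_dim),
        "b2": 0.0,
        "hidden": hidden,
    }

def predict_fake_probability(token_ids, m):
    x = m["emb"][np.asarray(token_ids)]                # embedding lookup: (T, emb_dim)
    feats = conv1d_relu(x, m["conv_w"], m["conv_b"])   # CNN: local n-gram features
    h_fwd = gru(feats, m["fwd"], m["hidden"])          # BiGRU: left-to-right pass
    h_bwd = gru(feats[::-1], m["bwd"], m["hidden"])    # BiGRU: right-to-left pass
    h = np.concatenate([h_fwd, h_bwd])
    hidden = np.maximum(m["W1"] @ h + m["b1"], 0.0)    # MLP hidden layer
    return float(sigmoid(m["w2"] @ hidden + m["b2"]))  # score in (0, 1)

model = init_model()
probability = predict_fake_probability([1, 5, 7, 2, 9, 3], model)
```

Because the weights are untrained, the returned score carries no meaning beyond showing how data flows through the stages; reproducing the reported accuracies would require the pre-trained embeddings, the datasets, and the training setup described in the full paper.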