Automatic Rumour Detection Model on Social Media

2020 Sixth International Conference on Parallel, Distributed and Grid Computing (PDGC), Pub Date: 2020-11-06, DOI: 10.1109/PDGC50313.2020.9315738

M. Bharti, Himanshu Jindal


Citations: 10

Abstract

Among social networking sites, Twitter in particular has become a popular venue for gossip. Rumours and false news spread easily through the Twitter network when users retweet them without checking whether they are true. Such reports cause public confusion, threaten the authority of the government, and pose a major threat to social order, so it is necessary to dispel them as quickly as possible. In this research, multiple descriptive and consumer-based features are retrieved from tweets and integrated with TF-IDF to build a composite feature set. This composite feature set is then used by several machine learning techniques: Support Vector Machine (SVM), Linear Regression, K-Nearest Neighbor (KNN), Naive Bayes, Decision Tree, Random Forest, and Gradient Boosting. Alongside these machine learning classifiers, a Convolutional Neural Network (CNN) is proposed to distinguish rumour from non-rumour tweets. The proposed model is evaluated on freely accessible Twitter datasets. The existing machine learning models achieve F1-scores of 0.46 to 0.76 for rumour detection, while the CNN model attains an F1-score of 0.77 for the rumour class. Overall, the CNN model gives the best results, with a weighted-average F1-score of 0.84 across the rumour and non-rumour categories. The proposed mechanism should help detect misinformation quickly, counteract the dissemination of rumours, and build users' confidence in social media sites.
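The abstract reports both a per-class F1-score for the rumour class and a weighted-average F1-score over both classes. As a minimal sketch of how these two figures relate, the Python below computes each from scratch. The function names and the ten toy labels are hypothetical and chosen only for illustration; they are not the paper's data, and the paper's class distribution is not stated in the abstract.

```python
from collections import Counter


def f1_per_class(y_true, y_pred, label):
    """F1-score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_f1(y_true, y_pred):
    """Average of per-class F1-scores, weighted by each class's support."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    return sum(
        f1_per_class(y_true, y_pred, label) * count / total
        for label, count in support.items()
    )


# Hypothetical toy labels (NOT from the paper): 3 rumours, 7 non-rumours.
y_true = ["rumour"] * 3 + ["non-rumour"] * 7
y_pred = (
    ["rumour", "rumour", "non-rumour"]   # one rumour missed
    + ["non-rumour"] * 6 + ["rumour"]    # one non-rumour flagged by mistake
)

rumour_f1 = f1_per_class(y_true, y_pred, "rumour")
overall_f1 = weighted_f1(y_true, y_pred)
print(f"rumour-class F1: {rumour_f1:.3f}")
print(f"weighted-average F1: {overall_f1:.3f}")
```

In this toy case the minority rumour class scores lower than the majority class, so the weighted average lands above the rumour-class score. That is one way a weighted average such as 0.84 can exceed a rumour-class score such as 0.77, although the abstract does not say whether this is what happens in the paper.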
