Classification of Hoax News Using the Naïve Bayes Method

Rama Qubra, Rizal Adi Saputra
International Journal Software Engineering and Computer Science (IJSECS), published 2024-04-01
DOI: 10.35870/ijsecs.v4i1.2068 (https://doi.org/10.35870/ijsecs.v4i1.2068)

Abstract

The dissemination of false and unsourced information, commonly known as hoaxes, has become a pervasive problem in the era of internet media. Hoax news can be used to influence elections, sway public opinion, and create political instability. The rapid evolution of information technology has accelerated the spread of hoax content, which calls for intelligent systems that can classify it effectively. This research implements a classification system for identifying hoax news circulating through internet media. The method used is Naïve Bayes, specifically Multinomial Naïve Bayes, which assumes that each feature (word) is conditionally independent of the others given the class. Text is vectorized with CountVectorizer, which converts each document into a numeric vector of word counts that a classification algorithm can use. The trained model makes predictions on the test data, and three evaluation outputs are computed: accuracy, the confusion matrix, and the classification report. The study aims to improve the accuracy and efficiency of distinguishing genuine news from hoaxes. The highest accuracy obtained was 94.73%, with 80% of the data used for training and 20% for testing. The confusion matrix counts were True Negative (TN): 4555, False Positive (FP): 178, False Negative (FN): 295, and True Positive (TP): 3952.
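The reported accuracy can be checked directly against the confusion matrix counts in the abstract. The short sketch below does that arithmetic and also derives precision, recall, and F1. These three values are not stated in the abstract, and they assume that "positive" refers to the hoax class, which the abstract does not say.

```python
# Confusion matrix counts as reported in the abstract.
# Assumption: the positive class is "hoax"; the abstract does not state this.
tn, fp, fn, tp = 4555, 178, 295, 3952

total = tn + fp + fn + tp                 # number of evaluated samples
accuracy = (tp + tn) / total              # share of correct predictions
precision = tp / (tp + fp)                # predicted positives that are correct
recall = tp / (tp + fn)                   # actual positives that are found
f1 = 2 * precision * recall / (precision + recall)

print(f"samples:   {total}")
print(f"accuracy:  {accuracy:.4f}")
print(f"precision: {precision:.4f}")
print(f"recall:    {recall:.4f}")
print(f"f1:        {f1:.4f}")
```

The four counts sum to 8,980 samples, and (4555 + 3952) / 8980 rounds to 0.9473, which agrees with the 94.73% accuracy stated above.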
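The pipeline described in the abstract (CountVectorizer, Multinomial Naïve Bayes, an 80/20 split, then accuracy, confusion matrix, and classification report) might look roughly like the following scikit-learn sketch. It is not the authors' code: the toy corpus, the label encoding (1 = hoax, 0 = genuine), the default CountVectorizer settings, the stratified split, and the `random_state` value are all assumptions made for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus; the paper's dataset is not reproduced here.
texts = [
    "miracle cure hidden by doctors share before it is deleted",
    "secret plot revealed forward this message to everyone",
    "celebrity death rumor spreads click to see shocking photos",
    "free money from the government claim now with this link",
    "aliens built the bridge officials refuse to comment",
    "city council approves budget for road repairs next year",
    "central bank keeps interest rate unchanged this quarter",
    "university publishes annual report on student enrollment",
    "weather agency forecasts heavy rain over the weekend",
    "health ministry announces schedule for vaccination program",
]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # assumed encoding: 1 = hoax, 0 = genuine

# 80% training / 20% test, as in the abstract; stratify and seed are assumptions.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=42, stratify=labels
)

# CountVectorizer turns each text into word counts; MultinomialNB models those counts.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred, labels=[0, 1])
tn, fp, fn, tp = matrix.ravel()  # row = true class, column = predicted class
report = classification_report(
    y_test, y_pred, labels=[0, 1], target_names=["genuine", "hoax"], zero_division=0
)

print(f"accuracy: {accuracy:.2f}")
print(matrix)
print(report)
```

With only ten toy documents the scores themselves are meaningless; the sketch only shows how the three evaluation outputs named in the abstract are produced from a held-out test split.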