Classification of Hoax News Using the Naïve Bayes Method

International Journal Software Engineering and Computer Science (IJSECS). Pub Date: 2024-04-01. DOI: 10.35870/ijsecs.v4i1.2068

Rama Qubra, Rizal Adi Saputra

Abstract

The spread of false and unsourced information, commonly known as hoaxes, has become a pervasive problem in the era of internet media. Hoax news can be used to influence elections, sway public opinion, and create political instability. The rapid evolution of information technology has contributed to the uncontrolled proliferation of hoax content, which calls for intelligent systems that can classify it effectively. This research implements a classification system for identifying hoax news circulating through internet media. The method used is Naive Bayes, specifically Multinomial Naive Bayes, which assumes that each feature (word) is independent of the others given the class. Text is vectorized with CountVectorizer, which converts each document into a numeric vector of word counts that a classification algorithm can use. The trained model makes predictions on the test data, and the program computes evaluation metrics: accuracy, a confusion matrix, and a classification report. With these methods, the study aims to improve the accuracy and efficiency of distinguishing genuine news from deceptive hoaxes. The highest accuracy obtained was 94.73%, with 80% of the data used for training and 20% for testing. The confusion matrix reported with this result was: True Negative (TN): 4555, False Positive (FP): 178, False Negative (FN): 295, True Positive (TP): 3952.
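As a rough illustration of the approach the abstract describes, the sketch below builds a small Multinomial Naive Bayes classifier over word counts using only the Python standard library. It is not the authors' program: the class `TinyMultinomialNB`, the `tokenize` helper, the add-one smoothing, and the four-sentence toy corpus are all assumptions made for this example, and the whitespace tokenizer is only a stand-in for what CountVectorizer does. The last lines recompute accuracy from the four confusion-matrix counts given in the abstract.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace (hypothetical stand-in for CountVectorizer).
    return text.lower().split()


class TinyMultinomialNB:
    """Multinomial Naive Bayes over word counts with add-one smoothing (assumed)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.n_docs = len(labels)
        # Per-class word counts: the "count vector" summed over each class.
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        self.total_words = {
            c: sum(self.word_counts[c].values()) for c in self.classes
        }
        return self

    def log_scores(self, text):
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.classes:
            # Log prior of the class.
            score = math.log(self.doc_counts[c] / self.n_docs)
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # words outside the fitted vocabulary are ignored
                # Independence assumption: per-word log likelihoods simply add up.
                numerator = self.word_counts[c][word] + 1
                denominator = self.total_words[c] + vocab_size
                score += math.log(numerator / denominator)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


def accuracy(tn, fp, fn, tp):
    # Fraction of all predictions that were correct.
    return (tn + tp) / (tn + fp + fn + tp)


# Hypothetical toy corpus, invented for this sketch (not the paper's dataset).
train_texts = [
    "miracle cure doctors hate this secret",
    "shocking secret they do not want you to know",
    "government confirms budget for new road",
    "central bank holds interest rate steady",
]
train_labels = ["hoax", "hoax", "real", "real"]

model = TinyMultinomialNB().fit(train_texts, train_labels)
prediction = model.predict("shocking miracle secret")

# Confusion-matrix counts as reported in the abstract.
reported_accuracy = accuracy(tn=4555, fp=178, fn=295, tp=3952)
print(prediction, round(reported_accuracy * 100, 2))
```

The reported counts sum to 8980 test items, of which 8507 are classified correctly, which is consistent with the stated 94.73% accuracy. In practice one would more likely use a library implementation than this hand-written class; the sketch only makes the counting and the independence assumption visible.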
