HINDI FAKE NEWS DETECTOR

IJISCS International Journal of Information System and Computer Science, Pub Date: 2023-08-31, DOI: 10.56327/ijiscs.v7i2.1465

A Siva Kumar, Adishree Solapurkar, Akshit Khamesra



Abstract

Fake news has proliferated on the internet in recent decades. Social networks allow more people than ever before to create and share information, much of which has no connection to reality. This has led to the rapid dissemination of false information for various political and business objectives, and the growth of online newspapers has made reliable news sources harder to find. In this work, we developed a "fake news detection system" for Hindi text. Hindi news articles were collected from multiple sources to diversify the dataset and train the model better. Techniques for pre-processing, feature extraction, classification, and prediction are covered in detail. The system first pre-processes the dataset; the pre-processing steps include data cleaning, stop-word removal, tokenization, and stemming. It then uses a pre-trained BERT model for feature extraction, after which the articles are classified and predictions are made. Previous work on fake news detection has employed various machine learning algorithms and deep learning models, such as Naïve Bayes, Long Short-Term Memory, and Logistic Regression. Here, training and testing use the BERT for sequence classification model, which is trained and then tested against the validation dataset.
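
The pre-processing steps named in the abstract (data cleaning, stop-word removal, tokenizing, stemming) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the stop-word list, the suffix list, and the minimum stem length are all assumptions chosen for illustration, and the abstract does not say which resources were actually used. In the described system the pre-processed text would then go to a pre-trained BERT model, a step not shown here.

```python
import re
import unicodedata

# Illustrative stop-word list (assumption: the paper does not publish its list).
HINDI_STOP_WORDS = {
    "है", "हैं", "का", "की", "के", "में", "से", "को",
    "पर", "और", "यह", "कि", "ने", "था", "थी",
}

# Illustrative inflectional suffixes for a light stemmer (assumption),
# sorted so that the longest suffix is always tried first.
HINDI_SUFFIXES = sorted(
    ["ियों", "ाओं", "ाएं", "ों", "ें", "ीं", "ा", "ी", "े", "ो", "ू", "ु", "ि"],
    key=len,
    reverse=True,
)


def clean_text(text: str) -> str:
    """Data cleaning: normalize Unicode, drop URLs, keep Devanagari letters only."""
    text = unicodedata.normalize("NFC", text)
    text = re.sub(r"https?://\S+", " ", text)
    # U+0900..U+0963 covers Devanagari letters and vowel signs; the danda
    # (U+0964), Devanagari digits, Latin text and punctuation are removed.
    text = re.sub(r"[^\u0900-\u0963\s]", " ", text)
    return " ".join(text.split())


def tokenize(text: str) -> list[str]:
    """Tokenizing: whitespace split is sufficient after cleaning."""
    return text.split()


def remove_stop_words(tokens: list[str]) -> list[str]:
    """Stop-word removal against the illustrative list above."""
    return [t for t in tokens if t not in HINDI_STOP_WORDS]


def light_stem(token: str, min_stem_len: int = 2) -> str:
    """Stemming: strip the longest matching suffix, keeping a minimum stem."""
    for suffix in HINDI_SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= min_stem_len:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> list[str]:
    """Run the four steps in the order listed in the abstract."""
    tokens = tokenize(clean_text(text))
    tokens = remove_stop_words(tokens)
    return [light_stem(t) for t in tokens]


if __name__ == "__main__":
    sample = "सरकार ने नई योजनाओं की घोषणा की है! https://example.com"
    print(preprocess(sample))
```

Suffix stripping of this kind is deliberately crude. Whether stemming helps before a subword-based model such as BERT is an empirical question that the abstract does not address.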
