使用深度学习和自然语言处理的假新闻检测

Anand Matheven, B. V. D. Kumar
{"title":"使用深度学习和自然语言处理的假新闻检测","authors":"Anand Matheven, B. V. D. Kumar","doi":"10.1109/ISCMI56532.2022.10068440","DOIUrl":null,"url":null,"abstract":"The rise of social media has brought the rise of fake news and this fake news comes with negative consequences. With fake news being such a huge issue, efforts should be made to identify any forms of fake news however it is not so simple. Manually identifying fake news can be extremely subjective as determining the accuracy of the information in a story is complex and difficult to perform, even for experts. On the other hand, an automated solution would require a good understanding of NLP which is also complex and may have difficulties producing an accurate output. Therefore, the main problem focused on this project is the viability of developing a system that can effectively and accurately detect and identify fake news. Finding a solution would be a significant benefit to the media industry, particularly the social media industry as this is where a large proportion of fake news is published and spread. In order to find a solution to this problem, this project proposed the development of a fake news identification system using deep learning and natural language processing. The system was developed using a Word2vec model combined with a Long Short-Term Memory model in order to showcase the compatibility of the two models in a whole system. This system was trained and tested using two different dataset collections that each consisted of one real news dataset and one fake news dataset. Furthermore, three independent variables were chosen which were the number of training cycles, data diversity and vector size to analyze the relationship between these variables and the accuracy levels of the system. It was found that these three variables did have a significant effect on the accuracy of the system. From this, the system was then trained and tested with the optimal variables and was able to achieve the minimum expected accuracy level of 90%. The achieving of this accuracy levels confirms the compatibility of the LSTM and Word2vec model and their capability to be synergized into a single system that is able to identify fake news with a high level of accuracy.","PeriodicalId":340397,"journal":{"name":"2022 9th International Conference on Soft Computing & Machine Intelligence (ISCMI)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Fake News Detection Using Deep Learning and Natural Language Processing\",\"authors\":\"Anand Matheven, B. V. D. Kumar\",\"doi\":\"10.1109/ISCMI56532.2022.10068440\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The rise of social media has brought the rise of fake news and this fake news comes with negative consequences. With fake news being such a huge issue, efforts should be made to identify any forms of fake news however it is not so simple. Manually identifying fake news can be extremely subjective as determining the accuracy of the information in a story is complex and difficult to perform, even for experts. On the other hand, an automated solution would require a good understanding of NLP which is also complex and may have difficulties producing an accurate output. Therefore, the main problem focused on this project is the viability of developing a system that can effectively and accurately detect and identify fake news. Finding a solution would be a significant benefit to the media industry, particularly the social media industry as this is where a large proportion of fake news is published and spread. In order to find a solution to this problem, this project proposed the development of a fake news identification system using deep learning and natural language processing. The system was developed using a Word2vec model combined with a Long Short-Term Memory model in order to showcase the compatibility of the two models in a whole system. This system was trained and tested using two different dataset collections that each consisted of one real news dataset and one fake news dataset. Furthermore, three independent variables were chosen which were the number of training cycles, data diversity and vector size to analyze the relationship between these variables and the accuracy levels of the system. It was found that these three variables did have a significant effect on the accuracy of the system. From this, the system was then trained and tested with the optimal variables and was able to achieve the minimum expected accuracy level of 90%. The achieving of this accuracy levels confirms the compatibility of the LSTM and Word2vec model and their capability to be synergized into a single system that is able to identify fake news with a high level of accuracy.\",\"PeriodicalId\":340397,\"journal\":{\"name\":\"2022 9th International Conference on Soft Computing & Machine Intelligence (ISCMI)\",\"volume\":\"8 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-11-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 9th International Conference on Soft Computing & Machine Intelligence (ISCMI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISCMI56532.2022.10068440\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 9th International Conference on Soft Computing & Machine Intelligence (ISCMI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISCMI56532.2022.10068440","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

社交媒体的兴起带来了假新闻的兴起,这些假新闻带来了负面后果。假新闻是一个巨大的问题,应该努力识别任何形式的假新闻,但这并不是那么简单。人工识别假新闻可能非常主观,因为确定新闻中信息的准确性非常复杂,即使是专家也很难做到。另一方面,自动化解决方案需要对NLP有很好的理解,这也很复杂,可能难以产生准确的输出。因此,本项目关注的主要问题是开发一个能够有效准确地检测和识别假新闻的系统的可行性。找到一个解决方案将对媒体行业,特别是社交媒体行业大有裨益,因为这是很大一部分假新闻发布和传播的地方。为了解决这一问题,本项目提出利用深度学习和自然语言处理技术开发假新闻识别系统。本系统采用Word2vec模型与长短期记忆模型相结合的方式开发,以展示两种模型在整个系统中的兼容性。该系统使用两个不同的数据集集进行训练和测试,每个数据集集由一个真实新闻数据集和一个假新闻数据集组成。进一步,选取训练周期数、数据多样性和向量大小三个自变量,分析这些变量与系统准确率水平之间的关系。结果表明,这三个变量对系统的精度有显著影响。然后,用最优变量对系统进行训练和测试,并能够达到90%的最低预期精度水平。这种准确性水平的实现证实了LSTM和Word2vec模型的兼容性,以及它们协同成一个能够以高准确性识别假新闻的单一系统的能力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Fake News Detection Using Deep Learning and Natural Language Processing
The rise of social media has brought the rise of fake news and this fake news comes with negative consequences. With fake news being such a huge issue, efforts should be made to identify any forms of fake news however it is not so simple. Manually identifying fake news can be extremely subjective as determining the accuracy of the information in a story is complex and difficult to perform, even for experts. On the other hand, an automated solution would require a good understanding of NLP which is also complex and may have difficulties producing an accurate output. Therefore, the main problem focused on this project is the viability of developing a system that can effectively and accurately detect and identify fake news. Finding a solution would be a significant benefit to the media industry, particularly the social media industry as this is where a large proportion of fake news is published and spread. In order to find a solution to this problem, this project proposed the development of a fake news identification system using deep learning and natural language processing. The system was developed using a Word2vec model combined with a Long Short-Term Memory model in order to showcase the compatibility of the two models in a whole system. This system was trained and tested using two different dataset collections that each consisted of one real news dataset and one fake news dataset. Furthermore, three independent variables were chosen which were the number of training cycles, data diversity and vector size to analyze the relationship between these variables and the accuracy levels of the system. It was found that these three variables did have a significant effect on the accuracy of the system. From this, the system was then trained and tested with the optimal variables and was able to achieve the minimum expected accuracy level of 90%. The achieving of this accuracy levels confirms the compatibility of the LSTM and Word2vec model and their capability to be synergized into a single system that is able to identify fake news with a high level of accuracy.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信