使用深度学习和自然语言处理的假新闻检测

2022 9th International Conference on Soft Computing & Machine Intelligence (ISCMI) Pub Date : 2022-11-26 DOI:10.1109/ISCMI56532.2022.10068440

Anand Matheven, B. V. D. Kumar

{"title":"使用深度学习和自然语言处理的假新闻检测","authors":"Anand Matheven, B. V. D. Kumar","doi":"10.1109/ISCMI56532.2022.10068440","DOIUrl":null,"url":null,"abstract":"The rise of social media has brought the rise of fake news and this fake news comes with negative consequences. With fake news being such a huge issue, efforts should be made to identify any forms of fake news however it is not so simple. Manually identifying fake news can be extremely subjective as determining the accuracy of the information in a story is complex and difficult to perform, even for experts. On the other hand, an automated solution would require a good understanding of NLP which is also complex and may have difficulties producing an accurate output. Therefore, the main problem focused on this project is the viability of developing a system that can effectively and accurately detect and identify fake news. Finding a solution would be a significant benefit to the media industry, particularly the social media industry as this is where a large proportion of fake news is published and spread. In order to find a solution to this problem, this project proposed the development of a fake news identification system using deep learning and natural language processing. The system was developed using a Word2vec model combined with a Long Short-Term Memory model in order to showcase the compatibility of the two models in a whole system. This system was trained and tested using two different dataset collections that each consisted of one real news dataset and one fake news dataset. Furthermore, three independent variables were chosen which were the number of training cycles, data diversity and vector size to analyze the relationship between these variables and the accuracy levels of the system. It was found that these three variables did have a significant effect on the accuracy of the system. From this, the system was then trained and tested with the optimal variables and was able to achieve the minimum expected accuracy level of 90%. The achieving of this accuracy levels confirms the compatibility of the LSTM and Word2vec model and their capability to be synergized into a single system that is able to identify fake news with a high level of accuracy.","PeriodicalId":340397,"journal":{"name":"2022 9th International Conference on Soft Computing & Machine Intelligence (ISCMI)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Fake News Detection Using Deep Learning and Natural Language Processing\",\"authors\":\"Anand Matheven, B. V. D. Kumar\",\"doi\":\"10.1109/ISCMI56532.2022.10068440\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The rise of social media has brought the rise of fake news and this fake news comes with negative consequences. With fake news being such a huge issue, efforts should be made to identify any forms of fake news however it is not so simple. Manually identifying fake news can be extremely subjective as determining the accuracy of the information in a story is complex and difficult to perform, even for experts. On the other hand, an automated solution would require a good understanding of NLP which is also complex and may have difficulties producing an accurate output. Therefore, the main problem focused on this project is the viability of developing a system that can effectively and accurately detect and identify fake news. Finding a solution would be a significant benefit to the media industry, particularly the social media industry as this is where a large proportion of fake news is published and spread. In order to find a solution to this problem, this project proposed the development of a fake news identification system using deep learning and natural language processing. The system was developed using a Word2vec model combined with a Long Short-Term Memory model in order to showcase the compatibility of the two models in a whole system. This system was trained and tested using two different dataset collections that each consisted of one real news dataset and one fake news dataset. Furthermore, three independent variables were chosen which were the number of training cycles, data diversity and vector size to analyze the relationship between these variables and the accuracy levels of the system. It was found that these three variables did have a significant effect on the accuracy of the system. From this, the system was then trained and tested with the optimal variables and was able to achieve the minimum expected accuracy level of 90%. The achieving of this accuracy levels confirms the compatibility of the LSTM and Word2vec model and their capability to be synergized into a single system that is able to identify fake news with a high level of accuracy.\",\"PeriodicalId\":340397,\"journal\":{\"name\":\"2022 9th International Conference on Soft Computing & Machine Intelligence (ISCMI)\",\"volume\":\"8 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-11-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 9th International Conference on Soft Computing & Machine Intelligence (ISCMI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISCMI56532.2022.10068440\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 9th International Conference on Soft Computing & Machine Intelligence (ISCMI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISCMI56532.2022.10068440","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

社交媒体的兴起带来了假新闻的兴起，这些假新闻带来了负面后果。假新闻是一个巨大的问题，应该努力识别任何形式的假新闻，但这并不是那么简单。人工识别假新闻可能非常主观，因为确定新闻中信息的准确性非常复杂，即使是专家也很难做到。另一方面，自动化解决方案需要对NLP有很好的理解，这也很复杂，可能难以产生准确的输出。因此，本项目关注的主要问题是开发一个能够有效准确地检测和识别假新闻的系统的可行性。找到一个解决方案将对媒体行业，特别是社交媒体行业大有裨益，因为这是很大一部分假新闻发布和传播的地方。为了解决这一问题，本项目提出利用深度学习和自然语言处理技术开发假新闻识别系统。本系统采用Word2vec模型与长短期记忆模型相结合的方式开发，以展示两种模型在整个系统中的兼容性。该系统使用两个不同的数据集集进行训练和测试，每个数据集集由一个真实新闻数据集和一个假新闻数据集组成。进一步，选取训练周期数、数据多样性和向量大小三个自变量，分析这些变量与系统准确率水平之间的关系。结果表明，这三个变量对系统的精度有显著影响。然后，用最优变量对系统进行训练和测试，并能够达到90%的最低预期精度水平。这种准确性水平的实现证实了LSTM和Word2vec模型的兼容性，以及它们协同成一个能够以高准确性识别假新闻的单一系统的能力。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Fake News Detection Using Deep Learning and Natural Language Processing

The rise of social media has brought the rise of fake news and this fake news comes with negative consequences. With fake news being such a huge issue, efforts should be made to identify any forms of fake news however it is not so simple. Manually identifying fake news can be extremely subjective as determining the accuracy of the information in a story is complex and difficult to perform, even for experts. On the other hand, an automated solution would require a good understanding of NLP which is also complex and may have difficulties producing an accurate output. Therefore, the main problem focused on this project is the viability of developing a system that can effectively and accurately detect and identify fake news. Finding a solution would be a significant benefit to the media industry, particularly the social media industry as this is where a large proportion of fake news is published and spread. In order to find a solution to this problem, this project proposed the development of a fake news identification system using deep learning and natural language processing. The system was developed using a Word2vec model combined with a Long Short-Term Memory model in order to showcase the compatibility of the two models in a whole system. This system was trained and tested using two different dataset collections that each consisted of one real news dataset and one fake news dataset. Furthermore, three independent variables were chosen which were the number of training cycles, data diversity and vector size to analyze the relationship between these variables and the accuracy levels of the system. It was found that these three variables did have a significant effect on the accuracy of the system. From this, the system was then trained and tested with the optimal variables and was able to achieve the minimum expected accuracy level of 90%. The achieving of this accuracy levels confirms the compatibility of the LSTM and Word2vec model and their capability to be synergized into a single system that is able to identify fake news with a high level of accuracy.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2022 9th International Conference on Soft Computing & Machine Intelligence (ISCMI)

自引率

0.00%

发文量