Debunking multi-lingual social media posts using deep learning.

International journal of information technology : an official journal of Bharati Vidyapeeth's Institute of Computer Applications and Management. Pub Date: 2023-06-04. DOI: 10.1007/s41870-023-01288-6

Bina Kotiyal, Heman Pathak, Nipur Singh

Pages: 1-13. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10239612/pdf/

Citations: 4

Abstract



Fake news on social media has become a growing concern because of its potential to shape public opinion. The proposed Debunking Multi-Lingual Social Media Posts using Deep Learning (DSMPD) approach offers a promising way to detect fake news. DSMPD builds a dataset of English and Hindi social media posts using web scraping and Natural Language Processing (NLP) techniques. This dataset is then used to train, test, and validate a deep learning-based model that extracts various features, including Embeddings from Language Models (ELMo), word and n-gram counts, Term Frequency-Inverse Document Frequency (TF-IDF), sentiment, polarity, and Named Entity Recognition (NER). Based on these features, the model classifies news items into five categories: real, could be real, could be fabricated, fabricated, or dangerously fabricated. To evaluate the performance of the classifiers, the researchers used two datasets comprising over 45,000 articles. Machine learning (ML) algorithms and a deep learning (DL) model are compared to select the best option for classification and prediction.
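To make two of the listed features concrete, the following is a minimal sketch of word n-gram counts and TF-IDF weighting in plain Python. It is not the authors' implementation: the whitespace tokenizer, the unsmoothed `log(N / df)` weighting, the function names `ngrams` and `tfidf`, and the toy posts are all assumptions made for illustration. The `LABELS` list simply restates the five categories named in the abstract.

```python
import math
from collections import Counter

# The five output categories named in the abstract.
LABELS = ["real", "could be real", "could be fabricated",
          "fabricated", "dangerously fabricated"]


def ngrams(text, n_max=2):
    """Return lowercase word n-grams (n = 1..n_max) of a post.

    Assumption: whitespace tokenization, which is only a rough
    stand-in for a real English/Hindi tokenizer.
    """
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf(posts, n_max=2):
    """Compute one sparse TF-IDF dict per post.

    tf  = count of the n-gram in the post / total n-grams in the post
    idf = log(N / df), where df is the number of posts containing it
    """
    counts = [Counter(ngrams(p, n_max)) for p in posts]
    df = Counter()
    for c in counts:
        df.update(c.keys())  # each post contributes at most 1 per n-gram
    n_posts = len(posts)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({g: (k / total) * math.log(n_posts / df[g])
                        for g, k in c.items()})
    return vectors


# Hypothetical toy posts, not taken from the paper's dataset.
posts = [
    "vaccine causes magnetism claims viral post",
    "health ministry confirms vaccine schedule",
    "यह खबर पूरी तरह झूठी है",
]
vectors = tfidf(posts)
```

Each entry of `vectors` maps an n-gram to its weight. An n-gram shared by several posts (here `vaccine`) gets a lower weight than one unique to a single post. In a pipeline like the one the abstract describes, such vectors would be combined with the other features (embeddings, sentiment, NER) before being passed to a classifier over `LABELS`. That combination step is not shown here.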


Source journal

International journal of information technology : an official journal of Bharati Vidyapeeth's Institute of Computer Applications and Management

Self-citation rate

0.00%

Number of publications