BERT-Based Deceptive Review Detection in Social Media: Introducing DeceptiveBERT

Impact Factor: 4.5 · CAS Zone 2 (Computer Science) · JCR Q1 (Computer Science, Cybernetics)

IEEE Transactions on Computational Social Systems · Pub Date: 2024-07-03 · DOI: 10.1109/TCSS.2024.3403937

Syeda Basmah Hyder;Noshina Tariq;Syed Atif Moqurrab;Muhammad Ashraf;Joon Yoo;Gautam Srivastava

Volume 11, Issue 6, pp. 7234-7243 · Full text: https://ieeexplore.ieee.org/document/10584138/

Citations: 0

Abstract


BERT-Based Deceptive Review Detection in Social Media: Introducing DeceptiveBERT

In recent years, the Internet has facilitated the emergence of social media platforms as significant channels for individuals to express their thoughts and engage in instantaneous interactions. However, the reliance on online reviews has also given rise to deceptive practices, where anonymous spammers generate fake reviews to manipulate the perception of a product. Ensuring the integrity of the online review system requires identifying and mitigating fake reviews. While existing machine learning (ML)- and neural network (NN)-based sentiment analysis methods can detect deceptive reviews, they often suffer from long training times, high computational resource requirements, and memory constraints. This study aims to overcome these limitations by introducing a transformer-based “deceptive bidirectional encoder representations from transformers (DeceptiveBERT) model.” This model utilizes contextual representations to enhance the precision of deceptive review identification. Transfer learning is employed to leverage knowledge from a pre-existing BERT base-uncased word embedding model, enabling efficient feature extraction. The proposed model incorporates a combination of classification layers to categorize reviews into two distinct categories: deceptive and truthful. Additionally, the study addresses the challenge of imbalanced datasets by utilizing three separate datasets and implementing appropriate methodologies for dataset curation. The effectiveness of the DeceptiveBERT model was evaluated through experimentation. The results demonstrate its efficacy, with the model achieving accuracy rates of 75%, 84.79%, and 81.08% on the Ott, YelpNYC, and YelpZip datasets, respectively.
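The abstract mentions handling imbalanced datasets but does not say which curation method was used. As a minimal sketch only, the snippet below shows random undersampling, one common way to balance a two-class review set before fine-tuning a classifier. The function name `undersample`, the toy reviews, and the choice of undersampling itself are assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import defaultdict


def undersample(reviews, labels, seed=0):
    """Randomly drop reviews from larger classes until every class has as
    many examples as the smallest class. Returns shuffled (reviews, labels).

    Hypothetical helper: the paper does not state that it uses this method.
    """
    rng = random.Random(seed)

    # Group review texts by their class label.
    by_label = defaultdict(list)
    for text, label in zip(reviews, labels):
        by_label[label].append(text)

    if not by_label:
        return [], []

    # Every class is cut down to the size of the smallest class.
    target = min(len(texts) for texts in by_label.values())

    balanced = []
    for label, texts in by_label.items():
        for text in rng.sample(texts, target):
            balanced.append((text, label))

    # Shuffle so the classes are interleaved rather than grouped.
    rng.shuffle(balanced)
    out_reviews = [text for text, _ in balanced]
    out_labels = [label for _, label in balanced]
    return out_reviews, out_labels


# Toy, made-up data: four "truthful" reviews and two "deceptive" ones.
reviews = [
    "great stay",
    "loved it",
    "best hotel ever!!!",
    "clean room",
    "would return",
    "amazing amazing amazing",
]
labels = ["truthful", "truthful", "deceptive", "truthful", "truthful", "deceptive"]

balanced_reviews, balanced_labels = undersample(reviews, labels)
```

Undersampling discards data, so on a small corpus an alternative such as class-weighted loss may be preferable; which trade-off the authors made is not stated in the abstract.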


Source journal

IEEE Transactions on Computational Social Systems (Social Sciences: Social Sciences, miscellaneous)

CiteScore: 10.00

Self-citation rate: 20.00%

Articles published: 316

Journal description: IEEE Transactions on Computational Social Systems focuses on such topics as modeling, simulation, analysis and understanding of social systems from the quantitative and/or computational perspective. "Systems" include man-man, man-machine and machine-machine organizations and adversarial situations as well as social media structures and their dynamics. More specifically, the proposed transactions publishes articles on modeling the dynamics of social systems, methodologies for incorporating and representing socio-cultural and behavioral aspects in computational modeling, analysis of social system behavior and structure, and paradigms for social systems modeling and simulation. The journal also features articles on social network dynamics, social intelligence and cognition, social systems design and architectures, socio-cultural modeling and representation, and computational behavior modeling, and their applications.