A Machine Learning Model for Spam Reviews and Spammer Community Detection

Kiran P. Rangar, Atiya Khan
Published in: 2022 IEEE World Conference on Applied Intelligence and Computing (AIC)
Publication date: 2022-06-17
DOI: 10.1109/AIC55036.2022.9848811 (https://doi.org/10.1109/AIC55036.2022.9848811)
Citations: 7

Abstract

People's decisions to purchase a product are influenced by its online ratings and recommendations. Spammers manipulate product sales by posting fraudulent ratings on online social media platforms. Most current research on online reviews has concentrated on supervised learning algorithms, which require labelled data, a requirement that is hard to satisfy for online reviews. This work concentrates on identifying misleading text reviews; its goal is to detect spam comments and spammer groups. Various spam detection strategies have been proposed in the literature, including Review-Linguistic (RL) based features, User-Behavioral (UB) based features, and Review-Behavioural (RB) based features, but none of them considers these feature types and their relative importance simultaneously while also modelling the communication between spam users. The proposed work builds a diverse network of user and feedback nodes and then applies the spam detection methodology to this communication environment. A feature weighting approach is presented to determine the relative importance of the features. Our solution uses an attention mechanism to discover spamming cues hidden in the review text, determining the relevance of each word by computing its weight. We classified the reviews with a CNN and compared the results against the standard Naive Bayes and Support Vector Machine algorithms.
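The abstract does not specify how the attention weights are computed. As a rough illustration only, and not the authors' implementation, the sketch below assumes a common formulation: each word's score is the dot product of its embedding with a context vector, and a softmax turns the scores into weights that sum to 1. The word list, the 2-dimensional embeddings, the context vector, and the function names are all made up for this example.

```python
import math


def attention_weights(word_vectors, context):
    """Softmax over dot-product scores between each word vector and a context vector."""
    scores = [sum(w * c for w, c in zip(vec, context)) for vec in word_vectors]
    peak = max(scores)  # subtract the maximum score for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_review_vector(word_vectors, weights):
    """Combine word vectors into one review vector using the attention weights."""
    dim = len(word_vectors[0])
    return [sum(a * vec[i] for a, vec in zip(weights, word_vectors)) for i in range(dim)]


# Hypothetical toy data: in a real model these values would be learned.
words = ["best", "product", "ever", "buy", "now"]
vectors = [[0.9, 0.1], [0.2, 0.3], [0.7, 0.2], [0.8, 0.9], [0.6, 0.8]]
context = [1.0, 1.0]  # assumed context vector, not taken from the paper

weights = attention_weights(vectors, context)
review_vector = weighted_review_vector(vectors, weights)

for word, weight in zip(words, weights):
    print(f"{word:>8s}  {weight:.3f}")
```

In a full pipeline the weighted review vector would be passed to a classifier such as the CNN mentioned in the abstract; that step is omitted here because the abstract gives no architectural details.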