Recital of supervised learning on review spam detection: An empirical analysis

Faisal Khurshid, Yan Zhu, Chubato Wondaferaw Yohannese, M. Iqbal
Published in: 2017 12th International Conference on Intelligent Systems and Knowledge Engineering (ISKE)
Publication date: 2017-11-01
DOI: 10.1109/ISKE.2017.8258755 (https://doi.org/10.1109/ISKE.2017.8258755)
Indexed via: Semantic Scholar
Citations: 8

Abstract

Online purchasing has become an integral part of our lives in this digital era, in which e-commerce websites let people buy products and services and share their experiences in the form of reviews. Both customers and companies use these reviews for decision making. Reviews help people reach buying decisions, but malicious users exploit them to promote or demote products or services intentionally. This phenomenon is called review spam. Review spam detection is the classification of reviews as malign (spam) or benign (normal). Our aim is therefore to evaluate the performance of supervised machine learning algorithms for review spam detection, using different feature sets extracted from a real-life dataset rather than a dataset tailored by Amazon Mechanical Turk (AMT) workers. We study several measures through experimentation, including recall, precision, and the receiver operating characteristic (ROC). AdaBoost outperforms all the other algorithms with a precision of 0.83; it correctly identified all spam reviews while misclassifying only a minuscule number of normal reviews.
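To make the reported measures concrete, the following minimal Python sketch shows how precision, recall, and the false positive rate (the x-axis of an ROC curve) follow from confusion-matrix counts when spam is treated as the positive class. The counts are hypothetical, chosen only so that precision comes out at 0.83 with every spam review caught; they are not the paper's data, and the names ConfusionCounts and example are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts with spam as the positive class."""
    tp: int  # spam reviews flagged as spam
    fp: int  # normal reviews wrongly flagged as spam
    fn: int  # spam reviews that were missed
    tn: int  # normal reviews correctly left alone


def precision(c: ConfusionCounts) -> float:
    """Fraction of flagged reviews that really are spam."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: ConfusionCounts) -> float:
    """Fraction of actual spam reviews that were flagged."""
    actual_spam = c.tp + c.fn
    return c.tp / actual_spam if actual_spam else 0.0


def false_positive_rate(c: ConfusionCounts) -> float:
    """Fraction of normal reviews wrongly flagged as spam."""
    actual_normal = c.fp + c.tn
    return c.fp / actual_normal if actual_normal else 0.0


# Hypothetical counts (NOT from the paper): no spam review is missed,
# and 17 of the 100 flagged reviews are actually normal.
example = ConfusionCounts(tp=83, fp=17, fn=0, tn=900)

if __name__ == "__main__":
    print(f"precision = {precision(example):.2f}")
    print(f"recall    = {recall(example):.2f}")
    print(f"FPR       = {false_positive_rate(example):.4f}")
```

Under these assumed counts, a recall of 1.0 means no spam review slips through, while a precision of 0.83 means roughly one in six flagged reviews is a normal review, which is small relative to the total number of normal reviews when spam is rare.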