Recital of supervised learning on review spam detection: An empirical analysis

2017 12th International Conference on Intelligent Systems and Knowledge Engineering (ISKE). Pub Date: 2017-11-01. DOI: 10.1109/ISKE.2017.8258755

Faisal Khurshid, Yan Zhu, Chubato Wondaferaw Yohannese, M. Iqbal


Citations: 8

Abstract


Recital of supervised learning on review spam detection: An empirical analysis

Online purchasing has become an integral part of our lives in this digital era, in which e-commerce websites allow people both to buy products or services and to share their experiences of them in the form of reviews. Customers and companies alike use these reviews for decision making. This facility helps people reach their buying decisions, but malicious users exploit it to intentionally promote or demote products or services. This phenomenon is called review spam. Review spam detection is the classification of reviews as malign or benign. Our aim is therefore to evaluate the performance of supervised machine learning algorithms for review spam detection, using different feature sets extracted from a real-life dataset rather than a dataset tailored by Amazon Mechanical Turk (AMT) workers. We study several measures through experimentation, including recall, precision, and the receiver operating characteristic (ROC). AdaBoost outperforms all other classifiers with a precision of 0.83: it correctly identifies all spam reviews while misclassifying only a minuscule number of normal reviews.
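The measures named in the abstract can be illustrated with a minimal Python sketch. It assumes binary labels (1 = spam, 0 = normal review); the helper names and the toy label lists are hypothetical and are not taken from the paper or its dataset. The toy predictions were chosen only to mirror the pattern the abstract describes: every spam review is caught (recall of 1.0) while a few normal reviews are wrongly flagged, which pulls precision below 1.0.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # spam reviews flagged as spam
    fp: int  # normal reviews wrongly flagged as spam
    fn: int  # spam reviews that were missed
    tn: int  # normal reviews correctly left alone


def count_outcomes(y_true, y_pred):
    """Tally prediction outcomes for binary labels (1 = spam, 0 = normal)."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 1:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


def precision(c):
    """Share of flagged reviews that really are spam."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c):
    """Share of spam reviews that were flagged (true positive rate)."""
    spam = c.tp + c.fn
    return c.tp / spam if spam else 0.0


def false_positive_rate(c):
    """Share of normal reviews wrongly flagged; the x-axis of a ROC plot."""
    normal = c.fp + c.tn
    return c.fp / normal if normal else 0.0


# Hypothetical toy labels (NOT the paper's data): 5 spam and 7 normal reviews.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

counts = count_outcomes(y_true, y_pred)
print(f"precision = {precision(counts):.2f}")
print(f"recall    = {recall(counts):.2f}")
print(f"ROC point = (FPR {false_positive_rate(counts):.2f}, TPR {recall(counts):.2f})")
```

A single set of hard predictions yields one point on the ROC plane; a full ROC curve would come from sweeping a decision threshold over classifier scores, which this sketch leaves out.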

