Deceptive Opinion Detection Using Machine Learning Techniques

International Journal of Information Engineering and Electronic Business. Pub Date: 2020-02-08. DOI: 10.5815/ijieeb.2020.01.01

Naznin Sultana, S. Palaniappan


Citations: 7

Abstract

Online reviews have become a valuable resource for customers deciding whether to purchase a product. Research shows that most people read online reviews before buying. Customer reviews have therefore become a crucial part of doing business online. Because a review can either promote or demote a product or service, buying and selling fake reviews has become a profitable business for some. In the past few years, deceptive review detection has attracted significant attention from both industry and academia. However, it remains a challenging problem because labeled datasets for supervised learning and evaluation are scarce. Studies also show that both state-of-the-art computational approaches and human readers have an error rate of about 35% to 48% in identifying fake reviews. This study investigates and analyzes customers' online reviews for deception detection using several supervised machine learning methods, and proposes a machine learning model based on the stochastic gradient descent algorithm for detecting spam reviews. To reduce bias and variance, bagging and boosting approaches are integrated into the model. In addition, rules based on regular expressions are generated to select the most appropriate features in the feature selection step. Experiments on a hotel review dataset demonstrate the effectiveness of the proposed approach.
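
The abstract does not give the regular-expression rules, the model configuration, or the ensemble details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes three hypothetical regex rules (`RULES`) as features, a logistic-regression classifier fitted with plain stochastic gradient descent, and bagging over bootstrap resamples; boosting is left out for brevity. All function names and hyperparameters are illustrative.

```python
import math
import random
import re

# Hypothetical regex rules; the paper's actual rules are not stated in the abstract.
RULES = [
    re.compile(r"\b(i|me|my|we|our)\b", re.I),               # first-person pronouns
    re.compile(r"!"),                                        # exclamation marks
    re.compile(r"\b(best|amazing|perfect|luxury)\b", re.I),  # superlatives
]


def featurize(text):
    """Count matches of each rule, normalised by the number of tokens."""
    n_tokens = max(len(text.split()), 1)
    return [len(rule.findall(text)) / n_tokens for rule in RULES]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp the logit to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(model, x):
    """Probability that feature vector x belongs to class 1 (deceptive)."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_sgd(X, y, epochs=200, lr=0.5, seed=0):
    """Fit logistic regression with one-sample-at-a-time gradient updates."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            g = predict_proba((w, b), X[i]) - y[i]  # d(log loss)/d(logit)
            w = [wj - lr * g * xj for wj, xj in zip(w, X[i])]
            b -= lr * g
    return w, b


def train_bagged(X, y, n_models=5, seed=0):
    """Bagging: fit each SGD model on a bootstrap resample of the data."""
    rng = random.Random(seed)
    models = []
    for m in range(n_models):
        idx = [rng.randrange(len(X)) for _ in X]
        models.append(
            train_sgd([X[i] for i in idx], [y[i] for i in idx], seed=seed + m)
        )
    return models


def predict_bagged(models, x):
    """Average the member probabilities and threshold at 0.5."""
    avg = sum(predict_proba(m, x) for m in models) / len(models)
    return 1 if avg >= 0.5 else 0
```

A usage example would call `featurize` on each labeled review, pass the resulting vectors and 0/1 labels to `train_bagged`, and classify a new review with `predict_bagged(models, featurize(text))`. Averaging over bootstrap-trained models is aimed at reducing variance; a boosting stage, which the abstract also mentions, would instead reweight misclassified reviews between rounds.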
