Opinion Spam Detection based on Supervised Sentiment Analysis Approach

Sepideh Jamshidi Nejad, Fatemeh Ahmadi-Abkenari, P. Bayat
Published in: 2020 10th International Conference on Computer and Knowledge Engineering (ICCKE), 2020-10-29
DOI: https://doi.org/10.1109/ICCKE50421.2020.9303677
Citations: 4

Abstract

Reading other users' experiences with products and services has become part of customers' purchase decision making. For this reason, such online resources have become a target for review spammers who aim either to promote their preferred products or to damage the reputation of their competitors. Distinguishing spam from genuinely expressed sentiment is highly challenging because it demands linguistic and grammatical knowledge. Because opinion analysis is language dependent, natural language processing and opinion mining are used to address the challenges of each language. In this paper, we focus on building a feature set that can be used with different classifiers as a trustworthy input for opinion spam detection in Persian. Research on review spam detection in English has uncovered some meaningful features; we use some of them, modify the meaning and usage of others to suit Persian, and add some new features. Our experiments show that Decision Tree and then AdaBoost, with accuracies of 98.67% and 98.00% respectively, are the best classifiers for Persian opinion spam detection. We also checked the robustness of our extended feature set by comparing it with other feature sets on the opinion spam detection task.
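To make the classifier side of the abstract concrete, here is a minimal sketch of AdaBoost over decision stumps (one-level decision trees) in plain Python. It is not the authors' implementation: the abstract does not list the paper's features, so the two features used here (`exclamation_count`, `word_count`), the toy rows, and all function names are assumptions made purely for illustration.

```python
import math

# Toy reviews described by two HYPOTHETICAL features:
# [exclamation_count, word_count]. Labels: +1 = spam, -1 = genuine.
X = [[5, 12], [4, 8], [6, 15], [0, 40], [1, 55], [0, 30]]
y = [1, 1, 1, -1, -1, -1]

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] >= t else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def stump_predict(stump, row):
    """Apply a one-feature threshold rule to a single row."""
    _, j, t, sign = stump
    return sign if row[j] >= t else -sign

def adaboost(X, y, rounds=3):
    """Fit a weighted ensemble of stumps, reweighting samples each round."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the log below stays finite for a perfect stump.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Misclassified samples gain weight; correct ones lose weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

ensemble = adaboost(X, y, rounds=3)
train_accuracy = sum(predict(ensemble, r) == t for r, t in zip(X, y)) / len(X)
print(train_accuracy)
```

The toy data is trivially separable, so this only shows the mechanics of boosting; the accuracies reported in the abstract come from the authors' own Persian data set and feature set, which are not reproduced here.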