Fatten Features and Drop Wastes: Finding Repeaters' Reviews by Feature Generation and Feature Selection

Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services. Pub Date: 2019-12-02. DOI: 10.1145/3366030.3366133

Naoki Muramoto, Hiromi Shiraga, Kilho Shin, Hiroaki Ohshima


Citations: 0

Abstract



In this paper, we propose a method for determining whether a given restaurant review comment was written by a repeater, that is, a customer who has visited the restaurant before. People often use restaurant review sites to decide which restaurant to go to. A reader can usually tell from a review comment whether the reviewer is a repeater of the restaurant, and a restaurant with many repeaters is presumably a good one. However, restaurant review sites usually do not provide a "revisit rate". We therefore tackle the problem of determining whether a review is a repeater's review or not. The difficulty is that many sentences in a review comment, such as what was ordered, what was delicious, or how the price was, are of no use for this decision. To address this difficulty, we take the following approach. First, a wide variety of features is extracted from the review comments so that no feature characteristic of repeaters' reviews is missed. Next, a feature selection method keeps only the features that actually contribute to the classification. Finally, classification is performed using a classifier. We implemented the proposed method using super-CWC [12], a state-of-the-art feature selection method, and an SVM. The experimental results show that the proposed method outperforms other methods.
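The three stages named in the abstract (generate many features, select the useful ones, classify) can be illustrated with a minimal sketch. It is not the authors' implementation. The toy English reviews are invented. A simple mutual-information ranking stands in for super-CWC, which is not reproduced here. The SVM is a small linear model trained by plain subgradient descent rather than a library solver. All function names and parameters (`generate_features`, `select_features`, `fit`, `k`, and so on) are hypothetical.

```python
from collections import Counter
from math import log


def generate_features(text):
    """Stage 1: over-generate binary features (word unigrams and bigrams)."""
    words = text.lower().split()
    return set(words) | {f"{a}_{b}" for a, b in zip(words, words[1:])}


def mutual_information(feature, docs, labels):
    """Mutual information (nats) between a feature's presence and the label."""
    n = len(docs)
    joint = Counter((feature in d, y) for d, y in zip(docs, labels))
    f_marg = Counter(feature in d for d in docs)
    y_marg = Counter(labels)
    return sum((c / n) * log(c * n / (f_marg[f] * y_marg[y]))
               for (f, y), c in joint.items())


def select_features(docs, labels, k):
    """Stage 2 (stand-in for super-CWC): keep the k highest-scoring features."""
    vocab = set().union(*docs)
    ranked = sorted(vocab,
                    key=lambda f: (-mutual_information(f, docs, labels), f))
    return ranked[:k]


def train_linear_svm(X, y, epochs=50, lam=0.01, lr=0.1):
    """Stage 3: linear SVM via subgradient descent on the hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1 - lr * lam) for wj in w]  # L2 shrinkage
            if margin < 1:  # point is inside the margin: push it out
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def fit(reviews, k):
    """Run all three stages; return the selected features and a predictor."""
    docs = [generate_features(text) for text, _ in reviews]
    labels = [label for _, label in reviews]
    selected = select_features(docs, labels, k)
    X = [[1.0 if f in d else 0.0 for f in selected] for d in docs]
    w, b = train_linear_svm(X, labels)

    def predict(text):
        feats = generate_features(text)
        score = sum(wj for wj, f in zip(w, selected) if f in feats) + b
        return 1 if score > 0 else -1  # +1: repeater's review, -1: not

    return selected, predict


# Invented toy data: label +1 marks a repeater's review, -1 any other review.
TOY_REVIEWS = [
    ("i came here again and the ramen was great", 1),
    ("back again for the pasta which was delicious", 1),
    ("the curry was cheap and again very tasty", 1),
    ("the ramen was great and cheap", -1),
    ("the pasta was delicious", -1),
    ("i tried the curry and it was very tasty", -1),
]

selected, predict = fit(TOY_REVIEWS, k=3)
print(selected)
print(predict("we visited again last week"))
```

In this toy data, words about the food and the price occur in both classes and score zero, which mirrors the abstract's point that most of a review is uninformative for this task. The selection step is what removes them before the classifier sees the data.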
