Sentiment Analysis for Review Rating Prediction in a Travel Journal

Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval. Pub Date: 2020-12-18. DOI: 10.1145/3443279.3443282

Jovelyn C. Cuizon, Carlos Giovanni Agravante


Citations: 3

Abstract

This paper applies sentiment analysis to predict the numerical rating of text reviews in a web-based travel journal application. The application lets users record the tourist spots they visit and write text reviews of them. Each review undergoes part-of-speech (POS) tagging, rule-based phrase chunking, and dependency parsing to extract opinion phrases as noun-adjective and noun-verb pairs. Each pair is then classified into one of four categories (accommodation, food, entertainment, or tourist attraction) by matching its noun against a curated bag-of-words (BOW), so that only relevant statements are included in the scoring. Word sense disambiguation using WordNet identifies the word sense that matches the meaning of the sentence. SentiWordNet, a lexical resource for sentiment analysis, is used to determine a polarity score representing the emotional intensity of the review. The system-predicted star rating was compared with the author's actual rating in Google Maps and with ratings from human annotators who were asked to label the text reviews. The mean absolute error (MAE) between the system and human ratings was low, which indicates that the predicted rating is close to the human interpretation of the text reviews. Overall rating prediction accuracy is 82%.
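The scoring and evaluation stages described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `CATEGORY_BOW` entries, the toy `POLARITY` lexicon (standing in for SentiWordNet scores), the linear mapping from mean polarity to a 1-5 star scale, and the neutral fallback are all hypothetical. The abstract does not specify how polarity is converted to stars. Opinion pairs are assumed to have already been extracted by the tagging, chunking, and parsing steps.

```python
from statistics import mean

# Hypothetical curated bag-of-words: noun -> category (illustrative entries only).
CATEGORY_BOW = {
    "room": "accommodation",
    "bed": "accommodation",
    "breakfast": "food",
    "pasta": "food",
    "show": "entertainment",
    "beach": "tourist attraction",
    "view": "tourist attraction",
}

# Toy polarity lexicon in [-1, 1]; a stand-in for SentiWordNet, not real scores.
POLARITY = {
    "clean": 0.6,
    "delicious": 0.8,
    "stunning": 0.9,
    "dirty": -0.7,
    "cold": -0.3,
}


def score_pairs(pairs):
    """Return polarity scores for pairs whose noun appears in the BOW.

    Pairs with an unknown noun are dropped as irrelevant; pairs whose
    opinion word is missing from the lexicon are skipped as well.
    """
    scores = []
    for noun, opinion in pairs:
        if noun in CATEGORY_BOW and opinion in POLARITY:
            scores.append(POLARITY[opinion])
    return scores


def to_star_rating(scores):
    """Map mean polarity onto 1-5 stars with an assumed linear mapping."""
    if not scores:
        return 3  # assumed neutral fallback when no relevant opinion is found
    return max(1, min(5, round(3 + 2 * mean(scores))))


def mean_absolute_error(predicted, actual):
    """Average absolute difference between two equal-length rating lists."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


if __name__ == "__main__":
    review_pairs = [("room", "clean"), ("breakfast", "delicious"), ("traffic", "dirty")]
    print(to_star_rating(score_pairs(review_pairs)))
```

Here the pair `("traffic", "dirty")` is ignored because `traffic` is not in the bag-of-words, which mirrors the abstract's point that only statements relevant to the four categories contribute to the score.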
