Aspect Analysis of Cebu Establishments' Online Reviews using k-means Clustering and word2vec

Kris Capao, Ken Gorro, Kim D. Gorro, M. J. Sabellano, C. Militante, Justin Paul C. Manalili
Published in: 2018 3rd International Conference on Computer and Communication Systems (ICCCS), 2018-04-27
DOI: 10.1109/CCOMS.2018.8463246 (https://doi.org/10.1109/CCOMS.2018.8463246)
Citations: 4

Abstract

Customer reviews are an important part of any business. With the development of technology, customer reviews are now usually found on the internet. In this study, online reviews of different Cebu establishments were gathered using Selenium as a web scraper. A total of 3776 online reviews were gathered. Word2vec and k-means clustering were used to analyze the online review corpora. To identify the best number of clusters, a series of experiments was conducted to find the best silhouette coefficient. To better interpret the k-means clusters, open coding was used to identify the significant qualitative codes. Based on the k-means clustering results, the following qualitative codes were identified: time, staff, friendly, service, affordable, love, food, price, ambiance, good, great, relax. Analyses of the clusters show that quality service, tasty and affordable food, and a good atmosphere are the significant aspects that the online reviews are concerned with. Based on the word2vec results, the researchers focused on the following words: waiters, relax, great, ambiance, service, and tasty. The results of the study provide meaningful insights into the groups of words obtained by analogy using the word2vec model, as well as the subject focus of the categories.
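The abstract says the number of clusters was chosen by comparing silhouette coefficients across a series of k-means runs. The following is a minimal sketch of that selection loop, not the authors' implementation. It rests on several assumptions: the 2-D points in TOY_VECTORS are invented stand-ins for review embeddings, the helper names (farthest_first_centers, kmeans, silhouette, choose_k) are hypothetical, and the deterministic farthest-first initialization is a simplification chosen so the example is reproducible.

```python
import math

# Toy 2-D points standing in for review embeddings.
# ASSUMPTION: invented data, not taken from the study.
TOY_VECTORS = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.0),
    (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.0, 5.2),
    (10.0, 0.0), (10.1, 0.2), (10.2, 0.1), (10.0, 0.2),
]


def farthest_first_centers(points, k):
    """Deterministic init: start at points[0], then repeatedly add the
    point farthest from the centers chosen so far (a simplification)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iters=100):
    """Plain Lloyd iterations; returns one cluster label per point."""
    centers = farthest_first_centers(points, k)
    labels = []
    for _ in range(max_iters):
        # Assignment step: nearest center for every point.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0  # the coefficient needs at least two clusters
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, k_values):
    """Run k-means for each candidate k and keep the best silhouette."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in k_values}
    best_k = max(scores, key=scores.get)
    return best_k, scores


if __name__ == "__main__":
    best_k, scores = choose_k(TOY_VECTORS, range(2, 6))
    print("best k:", best_k)
    for k, score in sorted(scores.items()):
        print(k, round(score, 3))
```

In practice the points would be vectors derived from the review text (for example, word2vec embeddings), and a library implementation of k-means and the silhouette score would normally replace these hand-written helpers.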