Aspect Based Sentiment Analysis: Restaurant Online Review Platform in Indonesia with Unsupervised Scraped Corpus in Indonesian Language
Authors: Samuel Mahatmaputra Tedjojuwono, Clement Neonardi
Published in: 2021 1st International Conference on Computer Science and Artificial Intelligence (ICCSAI), 2021-10-28
DOI: 10.1109/iccsai53272.2021.9609794 (https://doi.org/10.1109/iccsai53272.2021.9609794)
Citation count: 0
Abstract
This paper presents a dynamic dashboard that summarizes information about restaurants in Indonesia along four aspects: Food, Service, Ambience, and Covid Safety. Each aspect has its own rating, giving a detailed score for that aspect of the restaurant. The data behind the dashboard is produced with a semi-supervised, aspect-based sentiment analysis approach. The idea is to analyze past reviews and comments for each restaurant on an existing online review platform and extract both the aspect and the sentiment of each review. The restaurant lists and reviews were collected by web scraping Tripadvisor, one of the most widely used online review platforms in Indonesia. The scraped data was cleaned in several pre-processing steps using the Sastrawi and NLTK libraries for Indonesian-language text. The models that extract aspects and sentiments from each review were built on the MonkeyLearn machine learning platform and accessed through its APIs. The cleaned datasets were imported into the platform and annotated for model training, so that the models learn which words belong to each aspect category and what sentiment they carry. The analysis concludes, however, that accuracy may not be ideal, because few negative-sentiment reviews were gathered, which affected the model during training. In conclusion, the feature was successfully built, implemented, and deployed to a web server supported by Ngrok services; however, there is still room for improvement in the model's analysis.
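The abstract does not state how the per-aspect ratings on the dashboard are computed from the extracted aspects and sentiments. As a minimal sketch only, the Python snippet below assumes each review has already been reduced to (aspect, sentiment) pairs and scores an aspect as the share of positive mentions scaled to 0-5. The function name `aspect_ratings`, the 0-5 scale, and the sample data are hypothetical illustrations, not the authors' method.

```python
from collections import defaultdict

# The four aspects named in the abstract.
ASPECTS = ("Food", "Service", "Ambience", "Covid Safety")


def aspect_ratings(predictions, scale=5.0):
    """Turn (aspect, sentiment) pairs for one restaurant into per-aspect scores.

    Assumption (not from the paper): the score is the fraction of positive
    mentions multiplied by `scale`. Aspects with no mentions get None.
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for aspect, sentiment in predictions:
        # Ignore labels outside the four aspects or the two polarities.
        if aspect in ASPECTS and sentiment in ("positive", "negative"):
            counts[aspect][sentiment] += 1

    ratings = {}
    for aspect in ASPECTS:
        pos = counts[aspect]["positive"]
        neg = counts[aspect]["negative"]
        total = pos + neg
        ratings[aspect] = round(scale * pos / total, 2) if total else None
    return ratings


# Hypothetical model output for a single restaurant.
sample = [
    ("Food", "positive"), ("Food", "positive"), ("Food", "negative"),
    ("Service", "positive"),
    ("Covid Safety", "negative"),
]
ratings = aspect_ratings(sample)
print(ratings)
```

With a ratio-based score like this one, a shortage of negative reviews in the gathered data pushes every aspect toward the top of the scale, which is one way the class imbalance noted in the abstract could show up on the dashboard.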