Sentiment Analysis and Topic Modelling on Crowdsourced Data
Maria Angelika H Siallagan, Arie Wahyu Wijayanto
Indonesian Journal of Artificial Intelligence and Data Mining, published 2023-11-16
DOI: https://doi.org/10.24014/ijaidm.v7i1.24777
Citations: 0
Abstract
Data analysis supports decision-making by uncovering hidden patterns in data. User reviews of applications are a valuable form of crowdsourced data because they capture how satisfied users are, and developers can use them to identify parts of an application that need evaluation or improvement. This study classifies application reviews by sentiment using three classification algorithms: logistic regression, Support Vector Machines, and Random Forest. To examine the reviews labeled as negative, topic modeling is then performed using Latent Dirichlet Allocation (LDA). Logistic regression is the best-performing sentiment classifier, achieving an average accuracy of 0.925 and an average F1-score of 0.763. The LDA analysis of negative reviews yields three key topics: price-related issues, accessibility concerns, and application accuracy, each of which calls for reevaluation and potential improvement.
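The abstract reports an accuracy (0.925) that is noticeably higher than the F1-score (0.763). One common reason for such a gap is class imbalance, for example when positive reviews far outnumber negative ones. The sketch below illustrates how the two metrics can diverge. It makes two assumptions that the abstract does not confirm: the F1-score is macro-averaged over classes, and the labels are binary. The toy labels are hypothetical and are not the study's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1 for one class: harmonic mean of its precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical, imbalanced toy labels: 8 positive reviews, 2 negative.
y_true = ["pos"] * 8 + ["neg"] * 2
# The classifier gets every positive right but misses one of the two negatives.
y_pred = ["pos"] * 8 + ["pos", "neg"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

On this toy input a single missed negative review leaves accuracy at 0.9 while macro-F1 drops to roughly 0.80, because the minority class contributes half of the macro average.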