Feature Expansion for Sentiment Analysis in Twitter
E. B. Setiawan, D. H. Widyantoro, K. Surendro
2018 5th International Conference on Electrical Engineering, Computer Science and Informatics (EECSI), pp. 509-513
Published: 2018-10-01
DOI: 10.1109/EECSI.2018.8752851 (https://doi.org/10.1109/EECSI.2018.8752851)
Citation count: 12
Abstract
The public's need for social media, especially Twitter, is increasing because these platforms can be used to express opinions. Sentiment analysis can be used to understand public opinion on a topic, and its accuracy can be measured and improved by several methods. In this paper, we introduce a hybrid method that combines (a) basic features with feature expansion based on Term Frequency–Inverse Document Frequency (TF-IDF) and (b) basic features with feature expansion based on tweet-based features. We train the three most common classifiers in this field: Support Vector Machine (SVM), Logistic Regression (Logit), and Naïve Bayes (NB). Comparing the two feature expansions, we observe a significant accuracy increase with tweet-based features rather than with TF-IDF; the highest accuracy, 98.81%, is achieved with the Logistic Regression classifier.
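The two feature-expansion setups described in the abstract can be illustrated with a minimal sketch. The abstract does not define the "basic features" or the specific tweet-based features, so everything concrete here is an assumption made for illustration: raw term counts stand in for the basic features, the TF-IDF weight is the common tf × log(N/df) variant, and the tweet-based features (hashtag, mention, and exclamation-mark counts plus word count) are hypothetical. The function names are likewise invented, and the resulting vectors would still need to be passed to a classifier such as SVM, Logistic Regression, or Naïve Bayes.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into word, hashtag, and mention tokens."""
    return re.findall(r"[#@]?\w+", text.lower())


def basic_features(tokens, vocab):
    """Assumed 'basic features': raw term counts over the vocabulary."""
    counts = Counter(tokens)
    return [float(counts[term]) for term in vocab]


def tfidf_features(tokens, vocab, doc_freq, n_docs):
    """TF-IDF weights using tf * log(N / df), one common variant."""
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty tweets
    return [(counts[t] / total) * math.log(n_docs / doc_freq[t]) for t in vocab]


def tweet_features(text):
    """Hypothetical tweet-based features (not taken from the paper)."""
    return [
        float(len(re.findall(r"#\w+", text))),  # number of hashtags
        float(len(re.findall(r"@\w+", text))),  # number of mentions
        float(text.count("!")),                 # number of exclamation marks
        float(len(text.split())),               # number of whitespace-separated words
    ]


def build_feature_sets(docs):
    """Return (vocab, setup_a, setup_b): basic+TF-IDF and basic+tweet-based vectors."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    setup_a, setup_b = [], []
    for text, toks in zip(docs, tokenized):
        base = basic_features(toks, vocab)
        setup_a.append(base + tfidf_features(toks, vocab, doc_freq, len(docs)))
        setup_b.append(base + tweet_features(text))
    return vocab, setup_a, setup_b


# Toy tweets invented for this example.
tweets = [
    "I love this phone! #happy",
    "@airline worst flight ever",
    "love the new update #happy #great",
]
vocab, setup_a, setup_b = build_feature_sets(tweets)
print(len(vocab), len(setup_a[0]), len(setup_b[0]))
```

Each row of `setup_a` or `setup_b` is one tweet's feature vector. Training the same classifier on both sets and comparing held-out accuracy would mirror the comparison the abstract reports, although the figures from a toy setup like this say nothing about the paper's results.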