Lexicon Based Twitter Sentiment Analysis for Vote Share Prediction Using Emoji and N-gram Features
Barkha Bansal, S. Srivastava
International Journal of Web Based Communities (JCR Q2, Social Sciences), published 2019-04-01
DOI: 10.1504/IJWBC.2019.10018048 (https://doi.org/10.1504/IJWBC.2019.10018048)
Citations: 19
Abstract
Twitter sentiment analysis (TSA) has recently been employed successfully to monitor and forecast elections in many studies. However, most existing studies rely on extracting sentiment from explicit textual features, and only a few have included non-textual features such as emojis in election forecasts. In this study, we incorporated N-gram features to predict vote shares in the 2017 Uttar Pradesh (UP) legislative elections. We also found that the sentiment distribution of tweets containing emojis differed significantly from that of tweets without emojis; emoji sentiments were therefore detected and incorporated into the vote share prediction. We collected more than 0.3 million tweets, applying geo-tagging to search keywords that were not exclusive to the elections. We employed seven lexicons to label tweets and compared two methods for reducing prediction error: sentiment magnitude-based criteria and the polarity of tweets. Results show that the proposed method of incorporating N-gram features and emoji sentiments significantly decreases prediction error.
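The general idea of lexicon-based labelling with N-gram and emoji features can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy lexicons, their scores, the function names `score_tweet` and `predict_vote_shares`, the placeholder parties "A" and "B", and the rule that a party's predicted share is its fraction of all positive tweets. The abstract does not specify the seven lexicons or the exact prediction formula, so this is not the authors' method.

```python
from collections import Counter

# Toy lexicons with made-up scores (hypothetical, not the paper's lexicons).
NGRAM_LEXICON = {
    ("good",): 1.0,
    ("bad",): -1.0,
    ("win",): 1.0,
    ("not", "good"): -1.5,  # a bigram entry takes precedence over "good"
}
EMOJI_LEXICON = {"😀": 1.0, "😡": -1.0}


def score_tweet(text, max_n=2):
    """Return a sentiment score from n-gram matches plus emoji matches."""
    tokens = text.lower().split()
    score = 0.0
    i = 0
    while i < len(tokens):
        # Try the longest n-gram starting at position i first.
        for n in range(max_n, 0, -1):
            gram = tuple(tokens[i:i + n])
            if len(gram) == n and gram in NGRAM_LEXICON:
                score += NGRAM_LEXICON[gram]
                i += n
                break
        else:
            i += 1  # no lexicon entry starts here; skip this token
    # Emoji sentiment is added by scanning individual characters.
    score += sum(EMOJI_LEXICON.get(ch, 0.0) for ch in text)
    return score


def predict_vote_shares(tweets):
    """tweets: iterable of (party, text). Share = party's fraction of positive tweets."""
    positive = Counter(party for party, text in tweets if score_tweet(text) > 0)
    total = sum(positive.values())
    if total == 0:
        return {}
    return {party: count / total for party, count in positive.items()}


if __name__ == "__main__":
    sample = [
        ("A", "good win 😀"),
        ("A", "not good"),
        ("B", "win"),
        ("A", "good"),
    ]
    print(predict_vote_shares(sample))
```

Matching the longest N-gram first lets a negated phrase such as "not good" override the positive unigram "good", which is one reason N-gram features can change the label a unigram lexicon would assign. A magnitude-based variant could weight each tweet by its score instead of counting it once.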