Title: When Homecoming is not Coming: 2021 Homecoming Ban Sentiment Analysis on Twitter Data Using Support Vector Machine Algorithm
Authors: Lidia Sandra, Ford Lumbangaol
Published in: 2021 International Conference on ICT for Smart Society (ICISS)
Publication date: 2021-08-02
DOI: 10.1109/ICISS53185.2021.9533255
URL: https://doi.org/10.1109/ICISS53185.2021.9533255
Citations: 1
Abstract
Homecoming, traditionally known as Mudik, became a trending topic on several social media platforms as soon as the 11-day homecoming ban was announced on 7 April 2021. Opinions both in favor of and against the ban rapidly began to appear. Twitter, a social media platform that is now considered an extension of oneself and is often used to express one's opinions, became flooded with comments on the ban. These tweets were then used as a dataset for sentiment analysis in order to understand how people perceived the ban. Classification was performed with the Support Vector Machine algorithm, and the dataset was classified into three sentiments: positive, negative, and neutral. The Support Vector Machine algorithm yielded 62% accuracy on this dataset. The sentiment analysis showed that the keyword "mudik" carried a neutral sentiment for the most part. Meanwhile, the engagement analysis showed that the largest share of engagement, in the form of retweets and likes, went to tweets with a neutral sentiment. When the neutral sentiment was removed, the largest sentiment on the homecoming ban was negative. This is likely due to the release, on 22 April 2021, of an addendum to the Covid-19 Handling Task Force Circular Number 13 of 2021 that imposed more restrictions and extended the effective dates of the restrictions related to the homecoming ban, exactly one day before the scraping of 5,000 tweets from 23 April 2021 was carried out. The researchers also sampled the tweets with the most engagement (those with the most retweets and likes). Some of these tweets had a negative sentiment but were classified by the model as neutral. This may stem from inaccuracies in the training dataset, as some of the tweets were in Malay rather than Indonesian.
One challenge that remains is the limited number of datasets available for NLP training and sentiment analysis in Indonesian compared with English. On the other hand, this is an opportunity for the researchers to develop a more appropriate training model.
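The engagement analysis and the step of removing the neutral class can be illustrated with a minimal Python sketch. It assumes each tweet has already been assigned a sentiment label by a classifier; the records, field layout, and function names below are hypothetical and invented for illustration, and are not the study's data or code.

```python
from collections import Counter, defaultdict

# Toy records invented purely for illustration; NOT data from the study.
# Each record: (predicted sentiment label, retweets, likes).
tweets = [
    ("neutral", 120, 300),
    ("negative", 40, 90),
    ("neutral", 15, 20),
    ("positive", 10, 25),
    ("negative", 30, 60),
    ("neutral", 5, 8),
]


def sentiment_counts(records, exclude=()):
    """Count tweets per sentiment label, skipping any label in `exclude`."""
    return Counter(label for label, _, _ in records if label not in exclude)


def engagement_by_sentiment(records):
    """Sum retweets and likes separately for each sentiment label."""
    totals = defaultdict(lambda: {"retweets": 0, "likes": 0})
    for label, retweets, likes in records:
        totals[label]["retweets"] += retweets
        totals[label]["likes"] += likes
    return dict(totals)


all_counts = sentiment_counts(tweets)
# Drop the neutral class to compare only positive and negative tweets.
polar_counts = sentiment_counts(tweets, exclude=("neutral",))
engagement = engagement_by_sentiment(tweets)

dominant_overall = all_counts.most_common(1)[0][0]
dominant_polar = polar_counts.most_common(1)[0][0]

print("Most frequent label overall:", dominant_overall)
print("Most frequent label without neutral:", dominant_polar)
print("Engagement per label:", engagement)
```

With these toy records the neutral label dominates overall and the negative label dominates once neutral tweets are excluded, which mirrors the pattern the abstract reports without reproducing its figures.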