Performance Comparison of SVM, Naïve Bayes, and KNN Algorithms for Analysis of Public Opinion Sentiment Against COVID-19 Vaccination on Twitter

Khafifah Munawaroh, A. Alamsyah
Journal of Advances in Information Systems and Technology. Published 2023-03-10. DOI: 10.15294/jaist.v4i2.59493 (https://doi.org/10.15294/jaist.v4i2.59493)

Abstract

The emergence of COVID-19 in 2020 prompted the development of vaccines to slow the spread of the virus. The COVID-19 vaccine has nevertheless been controversial, and many people have expressed their views about it across various media, including the social media platform Twitter. Twitter data on the COVID-19 vaccine can therefore be used for sentiment analysis, which aims to determine whether a tweet expresses positive or negative sentiment. In this study, sentiment toward the COVID-19 vaccine on Twitter was analyzed using the Support Vector Machine (SVM), Naïve Bayes, and k-Nearest Neighbor (KNN) algorithms. SVM has the advantage of identifying the hyperplane that maximizes the margin between sentiment classes. Naïve Bayes is simple and fast, and can reach high accuracy once trained. KNN was chosen for its robustness to noise. The performance of the three classifiers is compared to determine which is better suited to text classification. The classification output consists of two classes, positive sentiment and negative sentiment, and accuracy serves as the benchmark for selecting the best model. Across ten tests, SVM performed best with an accuracy of 96.3%, while Naïve Bayes and KNN reached 94% and 91%, respectively. These high accuracies are supported by TF-IDF feature extraction and the TextBlob library.
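To make the TF-IDF and KNN steps concrete, the following is a minimal standard-library sketch, not the authors' implementation. The abstract does not state the TF-IDF variant, the value of k, the distance measure, or the preprocessing, so the IDF formula (log(N/df) + 1, one common variant), cosine similarity, k = 3, the function names, and the six toy tweets are all assumptions made purely for illustration.

```python
import math
from collections import Counter


def fit_tfidf(docs):
    """Learn IDF weights and return TF-IDF vectors for tokenized docs."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Assumed IDF variant: log(N / df) + 1, so ubiquitous terms keep weight 1.
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    return idf, [vectorize(doc, idf) for doc in docs]


def vectorize(tokens, idf):
    """Build a sparse {term: tf * idf} vector; unseen terms are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    total = sum(tf.values()) or 1
    return {t: (c / total) * idf[t] for t, c in tf.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(query, train_vecs, labels, k=3):
    """Majority vote among the k training vectors most similar to query."""
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: cosine(query, train_vecs[i]),
                    reverse=True)
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data, invented for this sketch (not from the study).
TRAIN = [
    ("vaccine safe effective grateful", "positive"),
    ("got vaccine feel great grateful", "positive"),
    ("vaccine safe protects family", "positive"),
    ("vaccine dangerous side effects", "negative"),
    ("vaccine side effects scared", "negative"),
    ("vaccine dangerous do not trust", "negative"),
]
DOCS = [text.split() for text, _ in TRAIN]
LABELS = [label for _, label in TRAIN]
IDF, TRAIN_VECS = fit_tfidf(DOCS)


def classify(text, k=3):
    """Classify a whitespace-tokenized tweet with the toy model above."""
    return knn_predict(vectorize(text.split(), IDF), TRAIN_VECS, LABELS, k)


if __name__ == "__main__":
    print(classify("feel safe and grateful"))
    print(classify("worried about dangerous side effects"))
```

In this sketch the word "vaccine" appears in every toy tweet, so it receives the minimum IDF weight and contributes little to the similarity; the rarer, sentiment-bearing words dominate the vote. A real comparison against SVM and Naïve Bayes would use a labeled tweet corpus and held-out evaluation rather than six hand-written examples.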