Twitter Data Sentimen Analysis 2024 Presidential Candidate Using Algorithm Naïve Bayes Classifier By Methods K-Fold Cross Validation

Aldi Prajela, Syafriandi Syafriandi, Dony Permana, Dina Fitria
UNP Journal of Statistics and Data Science. Published 2024-02-25. DOI: 10.24036/ujsds/vol2-iss1/149 (https://doi.org/10.24036/ujsds/vol2-iss1/149)

Abstract

Indonesia implements a democratic system in which the public takes part in General Elections (Pemilu) for specific political positions. Active users express their opinions on social media, especially about the 2024 Presidential Election (Pilpres) and the respective presidential candidates, which have become trending topics on Twitter. To turn these tweets into information, this study applies sentiment analysis using the Naïve Bayes Classifier (NBC) algorithm with the K-Fold Cross Validation method. The analysis proceeds through pre-processing, weighting, labeling, classification using NBC, and testing using a confusion matrix. The NBC classification results show that Anies Baswedan received 80% positive and 20% negative tweets out of 1186 tweets, Prabowo Subianto received 78% positive and 22% negative tweets out of 1149 tweets, and Ganjar Pranowo received 77% positive and 23% negative tweets out of 1075 tweets. Testing was carried out using the NBC algorithm with K-Fold Cross Validation, using values k=1 to k=10. The function of K-Fold Cross Validation here is to maximize the confusion matrix result. It can be concluded that Anies Baswedan had the highest score in iteration 4, with a precision of 90%, a recall of 99%, and an accuracy of 91%. Furthermore, Ganjar Pranowo had the highest score in iteration 9, with a precision of 95%, a recall of 97%, and an accuracy of 92%. Meanwhile, Prabowo Subianto had the highest score in iteration 9, with a precision of 97%, a recall of 99%, and an accuracy of 94%.
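
The evaluation loop described in the abstract (NBC classification, K-Fold Cross Validation, and precision, recall, and accuracy read from a confusion matrix) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The toy tweets, the "pos"/"neg" labels, the function names, the use of raw word counts in place of the paper's weighting step, the Laplace smoothing, and the interleaved fold split are all assumptions made for the example.

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model from tokenized documents."""
    class_counts = Counter(labels)
    word_counts = {c: Counter() for c in class_counts}
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return the class with the highest log posterior for one document."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokens:
            if w in vocab:  # words never seen in training are skipped
                # Laplace (add-one) smoothing, an assumption of this sketch
                score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def confusion_metrics(y_true, y_pred, positive="pos"):
    """Compute (precision, recall, accuracy) from confusion matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs)
    return precision, recall, accuracy

def k_fold_scores(docs, labels, k):
    """Test on each of k folds in turn, training on the other k-1 folds."""
    results = []
    for i in range(k):
        test_idx = set(range(i, len(docs), k))  # interleaved, unshuffled folds
        train_docs = [d for j, d in enumerate(docs) if j not in test_idx]
        train_labels = [l for j, l in enumerate(labels) if j not in test_idx]
        test_docs = [d for j, d in enumerate(docs) if j in test_idx]
        test_labels = [l for j, l in enumerate(labels) if j in test_idx]
        model = train_nb(train_docs, train_labels)
        preds = [predict_nb(model, d) for d in test_docs]
        results.append(confusion_metrics(test_labels, preds))
    return results

# Hypothetical, already pre-processed and labeled tweets (not the paper's data).
docs = [
    ["good", "leader"], ["great", "vision"], ["good", "vision"], ["great", "leader"],
    ["bad", "policy"], ["poor", "debate"], ["bad", "debate"], ["poor", "policy"],
]
labels = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg"]

results = k_fold_scores(docs, labels, k=4)
for iteration, (precision, recall, accuracy) in enumerate(results, start=1):
    print(f"iteration {iteration}: precision={precision:.2f} "
          f"recall={recall:.2f} accuracy={accuracy:.2f}")
```

Each iteration holds out a different fold, so the per-iteration scores printed here correspond to the kind of "highest score in iteration 4" or "iteration 9" figures reported above. Averaging the scores across folds is the more common way to summarize such a run, since a single best fold tends to overstate performance.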