Machine Learning Optimization using Bat Algorithm to Classify Sentiment of Twitter Users

Ema Utami, Suwanto Raharjo, Omar Muhammad Altoumi Alsyaibani, Candra Adipradana
Published in: 2022 International Conference on Business Analytics for Technology and Security (ICBATS)
Publication date: 2022-02-16
DOI: 10.1109/ICBATS54253.2022.9759029 (https://doi.org/10.1109/ICBATS54253.2022.9759029)
Citations: 0

Abstract

Social media is a highly effective communication medium in today's digital era, and Twitter is one of the platforms most widely used by Internet users. The huge number of tweets has encouraged research in text mining, especially sentiment analysis. Most sentiment analysis studies that mine data in Bahasa use TF-IDF to assign a weight to every word in the corpus. This traditional method resulted in low accuracy when tested with machine learning methods. In this study, instead of TF-IDF, we implemented the Bat Algorithm to weight every word in the corpus, and tested the approach with Naïve Bayes, Decision Tree and K-NN classifiers. With TF-IDF weighting, Naïve Bayes, Decision Tree and K-NN reached accuracies of 33.58%, 32.82% and 33.61%, respectively. With Bat Algorithm weighting, the same methods reached 39.01%, 76.63% and 66.15%, respectively. It can be inferred that using the Bat Algorithm to weight words in the corpus improves the ability of machine learning algorithms to classify the sentiment of Twitter users. The biggest improvement occurred with Decision Tree, whose accuracy increased by 43.81 percentage points. On the other hand, the improvement for Naïve Bayes (5.43 percentage points) remains minor compared with the other machine learning algorithms.
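The improvement figures can be checked directly from the accuracies reported in the abstract. The sketch below is only that arithmetic: the dictionary and function names are illustrative assumptions, not part of the paper, and the gains are absolute differences in percentage points rather than relative improvements.

```python
# Accuracies (%) as reported in the abstract; variable names are illustrative.
tfidf_accuracy = {"Naive Bayes": 33.58, "Decision Tree": 32.82, "K-NN": 33.61}
bat_accuracy = {"Naive Bayes": 39.01, "Decision Tree": 76.63, "K-NN": 66.15}


def accuracy_gains(before, after):
    """Return the absolute accuracy gain in percentage points per classifier."""
    return {name: round(after[name] - before[name], 2) for name in before}


gains = accuracy_gains(tfidf_accuracy, bat_accuracy)

# Report each classifier's gain, largest first.
for name, gain in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: +{gain:.2f} percentage points")
```

Under these numbers, Decision Tree shows the largest gain and Naïve Bayes the smallest, which matches the ordering stated in the abstract.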