Machine Learning Optimization using Bat Algorithm to Classify Sentiment of Twitter Users

2022 International Conference on Business Analytics for Technology and Security (ICBATS). Pub Date: 2022-02-16. DOI: 10.1109/ICBATS54253.2022.9759029

Ema Utami, Suwanto Raharjo, Omar Muhammad Altoumi Alsyaibani, Candra Adipradana


Citations: 0

Abstract

Social media is a highly effective communication medium in today's digital era, and Twitter is one of the platforms most widely used by Internet users. The huge number of tweets has encouraged research in text mining, especially sentiment analysis. Most sentiment analysis studies that mined data in Bahasa used TF-IDF to assign a weight to every word in the corpus. This traditional method yielded low accuracy when tested with machine learning methods. In this study, instead of TF-IDF, we implemented the Bat Algorithm to weight every word in the corpus, and we tested the result with Naïve Bayes, Decision Tree, and K-NN classifiers. With TF-IDF weighting, Naïve Bayes, Decision Tree, and K-NN reached accuracies of 33.58%, 32.82%, and 33.61%, respectively. With Bat Algorithm weighting, the same methods reached 39.01%, 76.63%, and 66.15%, respectively. These results suggest that weighting words with the Bat Algorithm improves how well machine learning algorithms classify the sentiment of Twitter users. The largest improvement occurred for Decision Tree, whose accuracy rose by 43.81 percentage points. The improvement for Naïve Bayes, by contrast, remained minor compared with the other algorithms.
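The comparison in the abstract can be checked with a little arithmetic. The sketch below only recomputes the per-classifier accuracy gain, in percentage points, from the six accuracies reported above. The names `tfidf`, `bat`, and `accuracy_gains` are illustrative and do not come from the paper, and nothing here reproduces the authors' weighting or evaluation pipeline.

```python
from decimal import Decimal

# Accuracies (%) exactly as reported in the abstract, keyed by classifier.
# Decimal avoids binary floating-point noise in the subtraction.
tfidf = {
    "Naive Bayes": Decimal("33.58"),
    "Decision Tree": Decimal("32.82"),
    "K-NN": Decimal("33.61"),
}
bat = {
    "Naive Bayes": Decimal("39.01"),
    "Decision Tree": Decimal("76.63"),
    "K-NN": Decimal("66.15"),
}


def accuracy_gains(before, after):
    """Return the absolute gain in percentage points for each classifier."""
    return {name: after[name] - before[name] for name in before}


gains = accuracy_gains(tfidf, bat)
largest = max(gains, key=gains.get)    # classifier with the biggest gain
smallest = min(gains, key=gains.get)   # classifier with the smallest gain

for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points")
print(f"Largest gain: {largest}; smallest gain: {smallest}")
```

The Decision Tree gain works out to 76.63 − 32.82 = 43.81 points, matching the figure quoted in the abstract, which shows that the reported improvement is an absolute difference, not a relative one.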
