A Performance Evaluation of Sentiment Classification Applying SVM, KNN, and Naive Bayes

2021 International Conference on Computing, Networking, Telecommunications & Engineering Sciences Applications (CoNTESA). Pub Date: 2021-12-09. DOI: 10.1109/contesa52813.2021.9657115

Md Deloar Hossan Jasy, Sakib Al Hasan, Md Ibrahim Khalil Sagor, Abdullah M. Noman, J. Ji


Citations: 1

Abstract

The rising use of the internet and social networks has opened up new avenues for individuals to express themselves. These platforms also hold a wealth of information: anyone can read other people’s opinions, which fall into numerous sentiment categories and are gradually becoming a primary input to decision-making. This study makes a significant contribution to sentiment classification, which is effective for analyzing large volumes of tweets whose de-contextualized sentiments are typically positive, negative, or neutral. To accomplish this, we first pre-processed the raw data, then extracted the meaningful words and phrases (the feature vector), selected the feature vector list, and applied machine-learning classification methods: Naive Bayes, KNN, and SVM. Finally, we assessed each classifier’s performance using accuracy, precision, recall, and F1-score. The Support Vector Machine achieved the highest accuracy at 92 percent, followed by KNN at 88 percent and Naive Bayes at 85 percent.
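
The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `binary_metrics`, the one-vs-rest treatment of a single "positive" class, and the toy labels are all assumptions made for illustration, and none of the numbers come from the paper.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive="positive"):
    """Return accuracy, precision, recall, and F1 for one class of interest.

    Assumption: every label other than `positive` is treated as "not positive"
    (a one-vs-rest view); the paper does not say how its scores were averaged.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Tally the four confusion-matrix cells.
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1

    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn

    # Guard each ratio against a zero denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels, not data from the paper.
y_true = ["positive", "positive", "negative", "negative",
          "positive", "negative", "positive", "negative"]
y_pred = ["positive", "negative", "negative", "positive",
          "positive", "negative", "positive", "negative"]

scores = binary_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

Accuracy alone can mislead when the sentiment classes are imbalanced, which is presumably why precision, recall, and F1 are reported alongside it.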
