Feature Selection Based Naïve Bayes Algorithm for Twitter Sentiment Analysis

Aruna T.M, A. K, Divyaraj G N, P. Pareek
Published in: 2022 Fourth International Conference on Emerging Research in Electronics, Computer Science and Technology (ICERECT), 2022-12-26
DOI: 10.1109/ICERECT56837.2022.10060604 (https://doi.org/10.1109/ICERECT56837.2022.10060604)
Citations: 0

Abstract

Twitter sentiment analysis is currently an intriguing area of study. It combines natural language processing techniques with the data mining methods used to build such systems. Most existing methods for analyzing Twitter sentiment perform poorly on messages that are both brief and ambiguous, since the message itself is the only kind of information they take into account. Using the right set of features to identify emotion in online text leads to better classification results. Because optimal feature selection is computationally hard, new approaches are needed to improve classifier performance. This work presents an effective method for analyzing Twitter user sentiment. The proposed method builds a machine learning model that identifies positive and negative tweets. Features are extracted after pre-processing, and including irrelevant features can lower classification accuracy. For this reason, the model uses the Vortex Search Algorithm (VSA) to select the best features and discard the rest. The final tweet classification is performed with the Naive Bayes (NB) algorithm. The experiments use four publicly available Twitter datasets, each measuring a different set of parameters. Marketing, detection of political polarization, and product reviews are just a few of the many areas that may benefit from the proposed system's ability to gauge user sentiment from tweets.
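
The abstract names the pipeline stages (pre-processing, feature extraction, VSA-based feature selection, NB classification) but gives no implementation details. The following is only a minimal sketch of how such a pipeline could be wired together, not the authors' implementation. All function names, the toy tweets, the bag-of-words features, the 0.5 threshold used to turn a continuous point into a feature mask, and the linear radius decay are assumptions made for illustration; the selection loop is a simplified, Vortex-Search-inspired procedure (candidates sampled around a moving center with a shrinking radius), not the original algorithm.

```python
import math
import random
from collections import Counter


def tokenize(text):
    """Minimal pre-processing (assumed): lowercase and split on whitespace."""
    return text.lower().split()


def train_nb(docs, labels, vocab):
    """Fit multinomial Naive Bayes with add-one smoothing over `vocab` only."""
    class_counts = Counter(labels)
    model = {}
    for c, n_docs in class_counts.items():
        counts = Counter(w for d, y in zip(docs, labels) if y == c
                         for w in d if w in vocab)
        total = sum(counts.values()) + len(vocab)
        log_like = {w: math.log((counts[w] + 1) / total) for w in vocab}
        model[c] = (math.log(n_docs / len(labels)), log_like)
    return model


def predict_nb(model, doc):
    """Return the class with the highest log-posterior; unselected words are skipped."""
    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(log_like[w] for w in doc if w in log_like)
    return max(sorted(model), key=score)


def accuracy(vocab, train, val):
    """Validation accuracy of NB trained on the selected vocabulary."""
    if not vocab:
        return 0.0
    model = train_nb(train[0], train[1], vocab)
    hits = sum(predict_nb(model, d) == y for d, y in zip(*val))
    return hits / len(val[1])


def vortex_style_select(words, fitness, iters=20, candidates=8, seed=0):
    """Sample candidate feature masks around a center with a shrinking radius."""
    rng = random.Random(seed)
    center = [0.5] * len(words)
    best, best_fit = set(words), fitness(set(words))
    for t in range(iters):
        radius = 0.5 * (1 - t / iters)  # assumed linear decay, a simplification
        for _ in range(candidates):
            point = [min(1.0, max(0.0, rng.gauss(c, radius))) for c in center]
            subset = {w for w, p in zip(words, point) if p > 0.5}
            fit = fitness(subset)
            # Prefer higher accuracy, then fewer features; move the center there.
            if fit > best_fit or (fit == best_fit and len(subset) < len(best)):
                best, best_fit, center = subset, fit, point
    return best


# Toy data invented for illustration; not from the paper's datasets.
train = ([tokenize(t) for t in ["i love this phone", "great service today",
                                "i hate this phone", "terrible service today"]],
         ["pos", "pos", "neg", "neg"])
val = ([tokenize(t) for t in ["love the great camera", "hate the terrible battery"]],
       ["pos", "neg"])
words = sorted({w for d in train[0] for w in d})


def fitness(subset):
    """Wrapper objective: NB validation accuracy for a candidate feature subset."""
    return accuracy(subset, train, val)


selected = vortex_style_select(words, fitness)
print(sorted(selected), fitness(selected))
```

In this sketch the selector is a wrapper method: each candidate feature subset is scored by training and validating the NB classifier on it, so the search never returns a subset that scores worse than the full vocabulary on the validation split. A real experiment would need a separate held-out test set to avoid overfitting the selection to the validation data.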