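Analisis Emosi pada Media Sosial Twitter Menggunakan Metode Multinomial Naive Bayes dan Synthetic Minority Oversampling Technique (Emotion Analysis on Twitter Social Media Using the Multinomial Naive Bayes Method and the Synthetic Minority Oversampling Technique)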

Fritson Agung Julians Ayomi, Kania Evita Dewi
Journal: Komputa: Jurnal Ilmiah Komputer dan Informatika
Published: 2023-09-30
DOI: 10.34010/komputa.v12i2.9454 (https://doi.org/10.34010/komputa.v12i2.9454)

Abstract

Twitter is often used to express emotions through tweets, and much research has been conducted on emotion analysis of Twitter data. Machine learning is widely used to categorize emotions, but an imbalance in the amount of data between classes is often a problem. This research therefore aims to determine the performance of Multinomial Naïve Bayes (MNB) combined with the Synthetic Minority Oversampling Technique (SMOTE) for emotion analysis of tweets. Each tweet passes through data preprocessing that includes case folding, data cleaning, slang word conversion, negation conversion, tokenization, stopword removal, and stemming. The n-gram method is used for feature extraction and term frequency for feature weighting. Testing was carried out using K-Fold Cross Validation. With SMOTE, an average accuracy of 0.65 (65%) and an average F1-score of 0.66 (66%) were obtained. Without SMOTE, the average accuracy was 0.64 (64%) and the average F1-score was 0.65 (65%). Although the results with SMOTE are thus 1 percentage point better in categorizing emotions, they are not optimal, and other data balancing and machine learning methods still need to be studied.
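The pipeline described in the abstract (n-gram term-frequency features, SMOTE balancing, then Multinomial Naïve Bayes) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the toy tweets, the labels "joy" and "anger", the unigram-plus-bigram range, the neighbour count k=2, the Laplace smoothing alpha=1.0, and all function names are assumptions made for illustration. The tweets are assumed to be already preprocessed and tokenized, and K-fold evaluation is left out for brevity.

```python
import math
import random
from collections import Counter


def ngram_tf(tokens, n_max=2):
    """Term-frequency counts of all 1..n_max-grams in a token list."""
    grams = []
    for n in range(1, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return Counter(grams)


def build_vocab(docs, n_max=2):
    """Sorted list of every n-gram seen in the training documents."""
    return sorted(set().union(*(ngram_tf(d, n_max) for d in docs)))


def to_vector(tokens, vocab, n_max=2):
    """Dense term-frequency vector of one document over a fixed vocabulary."""
    tf = ngram_tf(tokens, n_max)
    return [tf[g] for g in vocab]


def smote(X, y, k=2, seed=0):
    """Oversample each minority class up to the majority class size by
    interpolating between a sample and one of its k nearest same-class rows."""
    rng = random.Random(seed)
    sizes = Counter(y)
    target = max(sizes.values())
    X_out, y_out = list(X), list(y)
    for label, size in sizes.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        if len(rows) < 2:
            continue  # no neighbour available to interpolate with
        for _ in range(target - size):
            base = rng.choice(rows)
            neighbours = sorted(
                (r for r in rows if r is not base),
                key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position on the segment base -> other
            X_out.append([a + gap * (b - a) for a, b in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


def train_mnb(X, y, alpha=1.0):
    """Fit class log-priors and Laplace-smoothed feature log-likelihoods."""
    n_features = len(X[0])
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        totals = [sum(col) for col in zip(*rows)]
        denom = sum(totals) + alpha * n_features
        model[label] = (
            math.log(len(rows) / len(X)),
            [math.log((t + alpha) / denom) for t in totals],
        )
    return model


def predict_mnb(model, x):
    """Return the label with the highest posterior log-score."""
    def score(label):
        prior, loglik = model[label]
        return prior + sum(v * w for v, w in zip(x, loglik))
    return max(model, key=score)


# Hypothetical, already preprocessed tweets: 4 "joy" vs 2 "anger" (imbalanced).
docs = [["happy", "day"], ["happy", "smile"], ["love", "happy"],
        ["great", "happy", "day"], ["angry", "traffic"], ["angry", "boss"]]
labels = ["joy"] * 4 + ["anger"] * 2

vocab = build_vocab(docs)
X = [to_vector(d, vocab) for d in docs]
X_bal, y_bal = smote(X, labels)          # both classes now have 4 rows
model = train_mnb(X_bal, y_bal)
print(Counter(y_bal))
print(predict_mnb(model, to_vector(["angry", "traffic"], vocab)))
```

SMOTE produces fractional feature values, which the multinomial likelihood handles as weighted counts. When this kind of pipeline is evaluated with K-fold cross-validation, oversampling is usually applied to the training folds only, so that synthetic samples do not leak into the held-out fold.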