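Analisis Sentimen Pada Media Sosial Twitter Menggunakan Naive Bayes Classifier Dengan Ekstrasi Fitur N-Gram (Sentiment Analysis on Twitter Social Media Using a Naive Bayes Classifier with N-Gram Feature Extraction)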

A. Nugroho
Journal: J-SAKTI (Jurnal Sains Komputer dan Informatika)
Published: 2018-09-25
DOI: 10.30645/J-SAKTI.V2I2.83 (https://doi.org/10.30645/J-SAKTI.V2I2.83)
Citations: 20

Abstract

Social media is among the most widely accessed online media in the world. Microblogging services such as Twitter let users write about their experiences or review products, services, public figures, and so on. These posts can be mined for opinion or sentiment toward the entities being discussed. This study uses such data to determine public sentiment on the issue of rising electricity tariffs. Opinions are classified into three classes: positive, negative, and neutral. Users often use non-standard abbreviations or spellings, which can complicate processing and reduce classification accuracy, so the study applies text preprocessing to handle these problems. N-grams are used for feature extraction and a Naive Bayes classifier for classification. The results show that negative sentiment is the most common response to the increase in the basic electricity tariff. Testing with cross-validation and a confusion matrix shows that the Naive Bayes method reaches 89.67% accuracy before n-grams are applied and 92.00% after character n-grams are applied, an increase of 2.33 percentage points. The results indicate that n-gram feature extraction can improve the accuracy of the Naive Bayes method.
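The pipeline the abstract describes (character n-gram features, a Naive Bayes classifier, evaluation with a confusion matrix) can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the abstract does not state the n-gram length, the smoothing scheme, or the preprocessing steps, so n=3, add-one smoothing, lowercasing, and all function and class names here are assumptions, and the example tweets are invented.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of lowercased, whitespace-normalized text."""
    s = " ".join(text.lower().split())
    return [s[i:i + n] for i in range(len(s) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes over character n-grams with add-one smoothing (assumed)."""

    def fit(self, docs, labels, n=3):
        self.n = n
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.feature_counts[label].update(char_ngrams(doc, n))
        self.vocab = {g for counts in self.feature_counts.values() for g in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.class_counts[label] / total_docs)
            for gram in char_ngrams(doc, self.n):
                if gram in self.vocab:  # skip n-grams never seen in training
                    score += math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def confusion_matrix(true_labels, predicted_labels, classes):
    """Nested dict: matrix[true_class][predicted_class] -> count."""
    matrix = {t: {p: 0 for p in classes} for t in classes}
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of predictions on the diagonal of the confusion matrix."""
    correct = sum(matrix[c][c] for c in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Invented toy tweets, for illustration only (not the paper's data).
    docs = [
        "electricity tariff went up again, terrible",
        "thankful the bill is still affordable",
        "the new tariff starts next month",
    ]
    labels = ["negative", "positive", "neutral"]
    model = NaiveBayes().fit(docs, labels, n=3)
    predicted = [model.predict(d) for d in docs]
    matrix = confusion_matrix(labels, predicted, ["positive", "negative", "neutral"])
    print(matrix)
    print(accuracy(matrix))
```

Character n-grams are a plausible fit for the non-standard spellings the abstract mentions, since an abbreviated word still shares many short substrings with its standard form. In a cross-validation setup, the same `confusion_matrix` and `accuracy` helpers would be applied to each held-out fold and the accuracies averaged.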