Combining SentiStrength and Multilayer Perceptron in Twitter Sentiment Classification

Eko Yudhi Prastowo, Endroyono, E. M. Yuniarno
Published in: 2019 International Seminar on Intelligent Technology and Its Applications (ISITIA), 2019-08-01
DOI: 10.1109/ISITIA.2019.8937134 (https://doi.org/10.1109/ISITIA.2019.8937134)
Citations: 3

Abstract

The advancement of internet technology has made the use of social media part of people's lifestyle. Companies and governments use social media as a source of instant feedback, gauging user sentiment from comments and reviews. A sentiment is an opinion or view based on excessive feelings towards something. Whether a comment carries positive or negative sentiment can be determined manually, with humans analyzing comments one by one, or automatically, with machine learning performing the classification. Machine learning requires training data and test data that have positive and negative labels, and data labeling is generally done manually by humans. In this study, we used machine learning to classify the sentiment of data collected from Twitter. The machine learning method used is the Multilayer Perceptron, with Naive Bayes as a comparison. The dataset was labeled manually. As additional training data, labeled data was generated from unlabeled data using an English lexicon dictionary called SentiStrength. Feature extraction uses vectorization and TF-IDF. This study aims to measure how adding training data generated with SentiStrength during the learning process affects the accuracy of the machine learning model. The classification models were tested on 627 tweets. The result is that adding training data increased average accuracy by 5% of the initial accuracy. The Multilayer Perceptron is more accurate than Naive Bayes, with highest accuracies of 77.71% and 76.07%, respectively.
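
The lexicon-based labeling and TF-IDF steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the SentiStrength tool or its dictionary: the tiny `TOY_LEXICON`, the tie-handling rule, and all function names are assumptions made for illustration. The sketch only borrows the general idea of a dual score (a positive strength from 1 to 5 and a negative strength from -1 to -5) and uses it to label unlabeled tweets before adding them to a manually labeled training set.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon (NOT the SentiStrength dictionary):
# positive words score 2..5, negative words score -2..-5.
TOY_LEXICON = {
    "love": 4, "great": 3, "good": 2, "happy": 3,
    "hate": -4, "terrible": -4, "bad": -2, "sad": -3,
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_scores(text, lexicon=TOY_LEXICON):
    """Return (positive, negative) strengths for a text.

    The positive score is the strongest positive word (at least 1);
    the negative score is the strongest negative word (at most -1).
    """
    pos, neg = 1, -1
    for tok in tokenize(text):
        strength = lexicon.get(tok, 0)
        if strength > 0:
            pos = max(pos, strength)
        elif strength < 0:
            neg = min(neg, strength)
    return pos, neg


def lexicon_label(text, lexicon=TOY_LEXICON):
    """Map the dual score to 'positive', 'negative', or None on a tie."""
    pos, neg = lexicon_scores(text, lexicon)
    if pos > -neg:
        return "positive"
    if -neg > pos:
        return "negative"
    return None  # assumption: ties are left unlabeled and skipped


def augment(labeled, unlabeled, lexicon=TOY_LEXICON):
    """Append lexicon-labeled tweets to the manually labeled training set."""
    extra = []
    for text in unlabeled:
        label = lexicon_label(text, lexicon)
        if label is not None:
            extra.append((text, label))
    return list(labeled) + extra


def tfidf(docs):
    """Compute sparse TF-IDF vectors as dicts, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            tok: (c / len(toks)) * math.log(n_docs / doc_freq[tok])
            for tok, c in counts.items()
        })
    return vectors


# Usage with made-up tweets (illustrative data, not from the paper).
manual = [("I love this phone", "positive"), ("so bad, never again", "negative")]
unlabeled = ["What a great day", "I hate waiting", "just landed in Jakarta"]
training = augment(manual, unlabeled)
features = tfidf([text for text, _ in training])
```

The resulting feature vectors and labels would then be passed to a classifier. The abstract does not name a library; in scikit-learn, for example, `TfidfVectorizer` could replace the hand-written `tfidf`, and `MLPClassifier` and `MultinomialNB` would be candidate implementations of the two models being compared.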