A text mining application of emotion classifications of Twitter's users using Naïve Bayes method

2015 1st International Conference on Wireless and Telematics (ICWT) Pub Date : 2015-11-01 DOI:10.1109/ICWT.2015.7449218

Liza Wikarsa, Sherly Novianti Thahir


Citations: 51

Abstract

Twitter is a social media platform with more than 500 million users and 400 million tweets per day. Tweets written by its users express a variety of emotions. Most research on social media classifies sentiment into three categories: positive, negative, and neutral. However, none of these studies has developed an application that detects user emotions in social media, particularly on Twitter. This research therefore developed a text mining application that classifies the emotions of Twitter users into six categories: happiness, sadness, anger, disgust, fear, and surprise. The application uses three main text mining phases: preprocessing, processing, and validation. In the preprocessing phase, case folding, cleansing, stop-word removal, emoticon conversion, negation conversion, and tokenization were applied to the training data and the test data, following a sentiment analysis approach that performs morphological analysis to build several models. In the processing phase, the application performed weighting and classification using the Naive Bayes algorithm on the validated model. In the validation phase, the accuracy of the application was measured using 10-fold cross-validation. The findings showed that the application achieves 83% accuracy on 105 tweets. Achieving higher accuracy requires a better model in the training data.
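The three phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word, emoticon, and negation tables, the cleansing rules, the use of raw term counts as the "weighting", and the add-one smoothing are all assumptions made for illustration, since the abstract does not specify them. The names `preprocess`, `NaiveBayes`, and `cross_validate` are likewise hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical lookup tables; the abstract does not list the ones actually used.
STOP_WORDS = {"a", "an", "the", "is", "am", "are", "i", "to", "of", "so"}
EMOTICONS = {":)": "emo_happy", ":(": "emo_sad"}
NEGATIONS = {"not", "no", "never"}

def preprocess(tweet):
    """Preprocessing phase: returns a list of normalized tokens."""
    text = tweet.lower()                                  # case folding
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)     # cleansing (assumed rules)
    for emoticon, token in EMOTICONS.items():             # emoticon conversion
        text = text.replace(emoticon, f" {token} ")
    # Tokenization and stop-word removal
    tokens = [t for t in re.findall(r"[a-z_]+", text) if t not in STOP_WORDS]
    merged, i = [], 0
    while i < len(tokens):                                # negation conversion
        if tokens[i] in NEGATIONS and i + 1 < len(tokens):
            merged.append(tokens[i] + "_" + tokens[i + 1])  # "not happy" -> "not_happy"
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

class NaiveBayes:
    """Processing phase: multinomial Naive Bayes with add-one smoothing (assumed)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)        # raw term counts as weights
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)         # log prior
            for t in tokens:
                if t in self.vocab:                       # skip unseen words
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def cross_validate(tweets, labels, k=10):
    """Validation phase: k-fold cross-validation, returns overall accuracy."""
    docs = [preprocess(t) for t in tweets]
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, len(docs), k))         # every k-th item is held out
        train = [(d, l) for i, (d, l) in enumerate(zip(docs, labels))
                 if i not in test_idx]
        model = NaiveBayes().fit(*zip(*train))
        correct += sum(model.predict(docs[i]) == labels[i] for i in test_idx)
    return correct / len(docs)
```

With a labeled list of tweets, `cross_validate(tweets, labels)` returns the fraction classified correctly across the ten folds, which is the kind of figure the abstract reports as 83% on 105 tweets. The order of the preprocessing steps here differs slightly from the order listed in the abstract so that emoticons survive tokenization; that ordering is a design choice of this sketch.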

