Suboh Alkhushayni, Daniel C Zellmer, Ryan J DeBusk, Du’a Al-zaleq
Text emotion mining on Twitter
IOP SciNotes, published 2020-10-23
DOI: 10.1088/2633-1357/abc01e (https://doi.org/10.1088/2633-1357/abc01e)
Citation count: 2
Abstract
Twitter has become a medium through which a substantial share of the global population communicates feelings and reactions to current events. Emotion mining from text aims to capture these emotions by applying a series of algorithms to the contents of each tweet. In this study, tweets that expressed at least one of seven basic emotions were collected. The resulting dataset was a corpus of 42,000 tweets in which each emotion was represented in balanced proportion. From this corpus, a lexicon of roughly 40,000 words was created, each word associated with a weighted vector corresponding to one of the emotions. Next, different methods of identifying emotion in the cleaned tweets were applied and evaluated. These methods included both lexicon-based classification and supervised machine-learning classification. Finally, an ensemble method combining several multi-class classifiers trained on unigram features of the lexicon was evaluated. The evaluation showed that the ensemble method outperformed all other methods tested, both on existing datasets and on the dataset created for this study.
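The lexicon-based step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the weighting scheme (the share of a word's tweets that carry each emotion), the tokenizer, the three toy emotion labels, and the function names `build_lexicon`, `classify`, and `majority_vote` are all assumptions made for illustration. The abstract does not specify how the weights were computed or how the ensemble combined its classifiers; a simple majority vote is used here as a stand-in.

```python
from collections import Counter, defaultdict

PUNCT = ".,!?;:\"'()#@"

def tokenize(text):
    # Assumed tokenizer: whitespace split, strip punctuation, lowercase.
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

def build_lexicon(labeled_tweets):
    # Map each unigram to a weighted emotion vector.
    # Assumed weight: fraction of the tweets containing the word
    # that are labeled with each emotion.
    counts = defaultdict(Counter)
    for text, emotion in labeled_tweets:
        for word in set(tokenize(text)):
            counts[word][emotion] += 1
    lexicon = {}
    for word, per_emotion in counts.items():
        total = sum(per_emotion.values())
        lexicon[word] = {e: n / total for e, n in per_emotion.items()}
    return lexicon

def classify(text, lexicon):
    # Sum the emotion vectors of all known words and return the
    # highest-scoring emotion (ties broken alphabetically).
    scores = Counter()
    for word in tokenize(text):
        for emotion, weight in lexicon.get(word, {}).items():
            scores[emotion] += weight
    if not scores:
        return None  # no word of the tweet is in the lexicon
    return max(sorted(scores), key=scores.get)

def majority_vote(predictions):
    # Hypothetical ensemble rule: most frequent non-None prediction,
    # ties broken alphabetically.
    votes = Counter(p for p in predictions if p is not None)
    if not votes:
        return None
    top = max(votes.values())
    return min(e for e, n in votes.items() if n == top)

# Toy labeled data (invented for illustration, not from the study's corpus).
TOY_TWEETS = [
    ("I am so happy today", "joy"),
    ("what a happy surprise", "joy"),
    ("I am so angry right now", "anger"),
    ("this makes me furious and angry", "anger"),
    ("I feel sad and alone", "sadness"),
    ("so sad today", "sadness"),
]

if __name__ == "__main__":
    lexicon = build_lexicon(TOY_TWEETS)
    print(classify("angry and furious", lexicon))
    print(majority_vote(["joy", "anger", "joy"]))
```

In this sketch a word that appears under several emotions contributes a fractional weight to each, so frequent but uninformative words carry less influence than words tied to a single emotion.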