Emotion Detection with n-stage Latent Dirichlet Allocation for Turkish Tweets

Zekeriya Anil Guven, B. Diri, Tolgahan Cakaloglu
Academic Platform Journal of Engineering and Science, journal article, published 2019-09-28. DOI: 10.21541/apjes.459447 (https://doi.org/10.21541/apjes.459447)
Citations: 5

Abstract

Understanding the reasons behind the emotions expressed on social media plays a key role in learning to characterize the mood of previously unseen written texts. Knowing how to classify mood makes this technology useful in a variety of fields. In this study, Latent Dirichlet Allocation (LDA), a topic modeling algorithm, was used to determine which emotions tweets on Twitter express. The dataset consists of 4000 tweets categorized into 5 emotions: anger, fear, happiness, sadness, and surprise. Three root extraction methods (Zemberek, Snowball, and taking the first 5 letters of each word) were used to create the models. The generated models were tested using the proposed n-stage LDA method. With the proposed method, we aimed to increase the model's success rate by decreasing the number of words in the dictionary. Using multi-stage LDA, we performed better (2-stage: 70.5%, 3-stage: 76.4%) than the state-of-the-art result (60.4%), which was achieved using plain LDA for 5 classes.
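As a rough illustration of two ideas named in the abstract, the sketch below shows first-5-letters root extraction and one possible way to shrink the dictionary between LDA stages. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy tweets, the topic weights, the function names, and in particular the filtering rule (keep a word if its weight is at least the mean weight of some topic) are all hypothetical, since the abstract does not state how words are removed at each stage.

```python
def first_n_letters_root(word, n=5):
    """Approximate a root by truncating the lowercased word to its first n letters.

    Note: str.lower() does not apply Turkish casing rules for I/İ, so real
    tweets would need locale-aware lowercasing before this step.
    """
    return word.lower()[:n]


def build_dictionary(docs, n=None):
    """Return the set of distinct tokens; truncate each to n letters when n is given."""
    vocab = set()
    for doc in docs:
        for token in doc.split():
            vocab.add(first_n_letters_root(token, n) if n else token.lower())
    return vocab


def reduce_dictionary(topic_word_weights):
    """Keep words whose weight is at least the mean weight within some topic.

    ASSUMPTION: this threshold rule is illustrative only; the abstract says
    the dictionary is reduced but does not say how.
    """
    kept = set()
    for weights in topic_word_weights.values():
        mean = sum(weights.values()) / len(weights)
        kept.update(word for word, value in weights.items() if value >= mean)
    return kept


# Hypothetical toy tweets (not taken from the paper's dataset).
docs = ["bugün çok mutluyum", "mutluluk güzel", "korkuyorum", "korkuyor musun"]

full_vocab = build_dictionary(docs)          # untruncated tokens
root_vocab = build_dictionary(docs, n=5)     # first-5-letters roots

# Hypothetical per-topic word weights, standing in for one trained LDA stage.
stage1_weights = {
    "topic0": {"mutlu": 0.5, "güzel": 0.3, "çok": 0.2},
    "topic1": {"korku": 0.6, "bugün": 0.1, "çok": 0.3},
}
stage2_vocab = reduce_dictionary(stage1_weights)

print(len(full_vocab), len(root_vocab), sorted(stage2_vocab))
```

In this toy case truncation merges "mutluyum" and "mutluluk" into "mutlu", and "korkuyorum" and "korkuyor" into "korku", so the dictionary shrinks before any topic model is trained. In an n-stage setting, a model would then be retrained on the reduced dictionary and the filtering repeated for each further stage.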