Emotion Recognition in Reddit Comments Using Recurrent Neural Networks

Q3 Computer Science
Mahdi Rezapour
DOI: 10.2174/0126662558273325231201051141
Journal: Recent Advances in Computer Science and Communications
Published: 2023-12-15 (Journal Article)
Citations: 0

Abstract

Reddit comments are a valuable source of natural language data in which emotion plays a key role in human communication. However, emotion recognition is a difficult task that requires understanding the context and sentiment of a text. In this paper, we compare the effectiveness of four recurrent neural network (RNN) models for classifying the emotions of Reddit comments.

We use a small dataset of 4,922 comments labeled with four emotions: approval, disapproval, love, and annoyance. We use pre-trained GloVe.840B.300d embeddings as the input representation for all models. The models we compare are SimpleRNN, Long Short-Term Memory (LSTM), bidirectional LSTM, and Gated Recurrent Unit (GRU). We experiment with different text preprocessing steps, such as removing stopwords and applying stemming, removing negation words from the stopword list, and making the embedding layer trainable.

We find that GRU outperforms all other models, achieving an accuracy of 74%. Bidirectional LSTM and LSTM are close behind, while SimpleRNN performs the worst. The relatively low accuracy is likely due to sarcasm, irony, and complexity in the texts. We also observe that making the embedding layer trainable improves the performance of LSTM but significantly increases the computational cost and training time. We analyze examples of texts misclassified by GRU and identify the challenges and limitations of the dataset and the models.

In our study, GRU was the best of the four RNN models for emotion classification of Reddit comments. We also discuss future research directions for improving emotion recognition on Reddit comments. Furthermore, we provide an extensive discussion of the applications and methods behind each technique in the context of the paper.
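For reference, the GRU that performs best here is the standard gated recurrent unit (the general formulation, not anything specific to this paper), which combines an update gate z_t and a reset gate r_t:

```latex
z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)
r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)
\tilde{h}_t = \tanh\big(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\big)
h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
```

Compared with an LSTM, the GRU has no separate cell state and fewer parameters per unit, which often makes it faster to train and less prone to overfitting on small datasets like the 4,922-comment corpus used here.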
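The preprocessing steps described in the abstract can be sketched as follows. This is an illustrative stand-in, not the paper's implementation: the stopword list is a small hand-picked sample, the negation set shows the "keep negation words" idea, and `toy_stem` is a crude suffix-stripper standing in for a real stemmer such as NLTK's `PorterStemmer`.

```python
import re

# Small illustrative stopword list (a real pipeline would use e.g. NLTK's list).
STOPWORDS = {"the", "a", "an", "is", "are", "was", "am", "i", "you",
             "it", "this", "to", "of", "in", "and"}
# Per the paper's idea, negation words are removed from the stopword list
# so that they survive preprocessing and can flip an emotion's polarity.
NEGATIONS = {"not", "no", "never", "nor"}
STOPWORDS -= NEGATIONS

def toy_stem(word: str) -> str:
    """Crude stand-in for a stemmer: strip a few common suffixes."""
    for suffix in ("ing", "ed", "ly", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def preprocess(comment: str) -> list[str]:
    """Lowercase, tokenize, drop stopwords (keeping negations), and stem."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    return [toy_stem(t) for t in tokens if t not in STOPWORDS]

print(preprocess("I am not loving the endless arguing in this thread"))
```

Keeping "not" in the token stream matters for labels like approval vs. disapproval, where a single negation word can invert the intended emotion.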
Source journal

Recent Advances in Computer Science and Communications (Computer Science, all)
CiteScore: 2.50
Self-citation rate: 0.00%
Articles published: 142