Emotion Recognition in Reddit Comments Using Recurrent Neural Networks

Q3 Computer Science
Mahdi Rezapour
DOI: 10.2174/0126662558273325231201051141
Journal: Recent Advances in Computer Science and Communications
Published: 2023-12-15 (Journal Article)
Citations: 0

Abstract

Reddit comments are a valuable source of natural language data in which emotion plays a key role in human communication. However, emotion recognition is a difficult task that requires understanding the context and sentiment of a text. In this paper, we compare the effectiveness of four recurrent neural network (RNN) models for classifying the emotions of Reddit comments.

We use a small dataset of 4,922 comments labeled with four emotions: approval, disapproval, love, and annoyance. We use pre-trained GloVe.840B.300d embeddings as the input representation for all models. The models we compare are SimpleRNN, Long Short-Term Memory (LSTM), bidirectional LSTM, and Gated Recurrent Unit (GRU). We experiment with different text preprocessing steps, such as removing stopwords and applying stemming, removing negation words from the stopword list, and making the embedding layer trainable.

We find that GRU outperforms all other models, achieving an accuracy of 74%. Bidirectional LSTM and LSTM are close behind, while SimpleRNN performs the worst. The relatively low accuracy is likely due to sarcasm, irony, and complexity in the texts. We also observe that making the embedding layer trainable improves the performance of LSTM but significantly increases the computational cost and training time. We analyze examples of texts misclassified by GRU and identify the challenges and limitations of the dataset and the models.

In our study, GRU was the best of the four RNN models for emotion classification of Reddit comments. We also discuss future research directions for improving emotion recognition on Reddit comments. Furthermore, we provide an extensive discussion of the applications and methods behind each technique in the context of the paper.
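For reference, the GRU that performs best here is the standard gated recurrent unit (the general formulation, not anything specific to this paper), which combines an update gate z_t and a reset gate r_t:

```latex
z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)
r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)
\tilde{h}_t = \tanh\big(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\big)
h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
```

Compared with an LSTM, the GRU has no separate cell state and fewer parameters per unit, which often makes it faster to train and less prone to overfitting on small datasets like the 4,922-comment corpus used here.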
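The preprocessing steps described in the abstract can be sketched as follows. This is an illustrative stand-in, not the paper's implementation: the stopword list is a small hand-picked sample, the negation set shows the "keep negation words" idea, and `toy_stem` is a crude suffix-stripper standing in for a real stemmer such as NLTK's `PorterStemmer`.

```python
import re

# Small illustrative stopword list (a real pipeline would use e.g. NLTK's list).
STOPWORDS = {"the", "a", "an", "is", "are", "was", "am", "i", "you",
             "it", "this", "to", "of", "in", "and"}
# Per the paper's idea, negation words are removed from the stopword list
# so that they survive preprocessing and can flip an emotion's polarity.
NEGATIONS = {"not", "no", "never", "nor"}
STOPWORDS -= NEGATIONS

def toy_stem(word: str) -> str:
    """Crude stand-in for a stemmer: strip a few common suffixes."""
    for suffix in ("ing", "ed", "ly", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def preprocess(comment: str) -> list[str]:
    """Lowercase, tokenize, drop stopwords (keeping negations), and stem."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    return [toy_stem(t) for t in tokens if t not in STOPWORDS]

print(preprocess("I am not loving the endless arguing in this thread"))
```

Keeping "not" in the token stream matters for labels like approval vs. disapproval, where a single negation word can invert the intended emotion.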
Source journal

Recent Advances in Computer Science and Communications (Computer Science, all)
CiteScore: 2.50
Self-citation rate: 0.00%
Articles published: 142