Contextual Urdu Text Emotion Detection Corpus and Experiments using Deep Learning Approaches

Impact factor: 1.7 | JCR Q3 | Computer Science, Artificial Intelligence

ADCAIJ-Advances in Distributed Computing and Artificial Intelligence Journal | Pub Date: 2023-06-05 | DOI: 10.14201/adcaij.30128

Muhammad Hamayon Khan Vardag, Ali Saeed, Umer Hayat, Muhammad Farhat Ullah, Naveed Hussain


Citations: 0

Abstract

Textual emotion detection aims to identify human emotions from written text. The task is challenging because text lacks the facial and vocal cues that usually signal emotion. Considerable research has addressed textual emotion detection in high-resource languages such as English, French, and Chinese. Urdu, despite having over 300 million speakers and a large volume of literature available online, has not been adequately investigated for this task. To address this gap, this study makes two contributions. First, it introduces a novel dialogue-based corpus for Urdu, the Contextual Urdu Text Emotion Detection Corpus (CUTEC). CUTEC contains 30,160 training and 5,509 testing labelled dialogues, where each dialogue consists of three contextual Urdu sentences. All dialogues are labelled using four emotion classes: Happy, Sad, Angry, and Other. Second, five deep learning models (RNN, LSTM, Bi-LSTM, GRU, and Bi-GRU) are trained and tested on CUTEC under different parameter settings. The best results (Accuracy = 87.28 and F1 = 0.87) are obtained with a GRU-based architecture.
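The abstract reports accuracy and F1 over four emotion classes. As a minimal sketch of how such scores can be computed, the Python below evaluates a list of predicted labels against gold labels. It assumes each dialogue carries exactly one label and that F1 is macro-averaged across classes; the abstract does not state the averaging scheme, so that choice is an assumption. The helper names and the toy labels are hypothetical and are not drawn from CUTEC.

```python
from collections import Counter

# The four emotion classes named in the abstract.
LABELS = ["Happy", "Sad", "Angry", "Other"]


def accuracy(gold, pred):
    """Fraction of dialogues whose predicted label matches the gold label."""
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy labels, invented for illustration only (not CUTEC data).
gold = ["Happy", "Sad", "Angry", "Other", "Happy", "Sad", "Other", "Other"]
pred = ["Happy", "Sad", "Other", "Other", "Happy", "Angry", "Other", "Happy"]

acc = accuracy(gold, pred)
f1 = macro_f1(gold, pred)
print(f"accuracy = {acc:.4f}, macro-F1 = {f1:.4f}")
```

Macro averaging weights every class equally, so a rarely predicted class such as Angry in the toy data pulls the score down more than it would under a weighted average. If the paper used weighted or micro averaging, the F1 computation would differ accordingly.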




Source journal