Contextual Urdu Text Emotion Detection Corpus and Experiments using Deep Learning Approaches

Impact Factor: 1.7 | JCR: Q3 | Category: Computer Science, Artificial Intelligence
Muhammad Hamayon Khan Vardag, Ali Saeed, Umer Hayat, Muhammad Farhat Ullah, Naveed Hussain
Journal: ADCAIJ-Advances in Distributed Computing and Artificial Intelligence Journal
Published: 2023-06-05
DOI: https://doi.org/10.14201/adcaij.30128
Citations: 0

Abstract
Textual emotion detection aims to identify human emotions from written text. The task is challenging because text lacks the facial and vocal cues that usually signal emotion. Considerable research has addressed textual emotion detection in high-resource languages such as English, French, and Chinese. Urdu, despite having over 300 million speakers and a large volume of literature available online, has not been properly investigated for this task. To address this gap, this study makes two contributions. (1) It creates a novel dialogue-based corpus for Urdu, the Contextual Urdu Text Emotion Detection Corpus (CUTEC). CUTEC contains 30,160 training and 5,509 testing labelled dialogues, where each dialogue consists of three contextual Urdu sentences. Every dialogue is labelled with one of four emotion classes: Happy, Sad, Angry, or Other. (2) It trains and tests five deep learning models (RNN, LSTM, Bi-LSTM, GRU, and Bi-GRU) on CUTEC under different parameter settings. The best results (Accuracy = 87.28% and F1 = 0.87) are attained using a GRU-based architecture.
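To make the corpus layout and the two reported metrics concrete, the following is a minimal sketch in Python, not the authors' code. It assumes a hypothetical in-memory record (`Dialogue`, with fields `turns` and `label`), since the abstract does not describe a file format, and it assumes macro-averaged F1, since the abstract does not state which averaging was used. The toy labels in the usage lines are invented for illustration and are not taken from CUTEC.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple

# The four emotion classes named in the abstract.
LABELS: Tuple[str, ...] = ("Happy", "Sad", "Angry", "Other")


@dataclass(frozen=True)
class Dialogue:
    """One labelled example: three contextual sentences plus one emotion label.

    Field names are hypothetical; the abstract does not specify a schema.
    """
    turns: Tuple[str, str, str]
    label: str

    def __post_init__(self) -> None:
        if len(self.turns) != 3:
            raise ValueError("each dialogue must contain exactly three sentences")
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label!r}")


def accuracy(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Fraction of dialogues whose predicted label equals the gold label."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def per_class_f1(gold: Sequence[str], pred: Sequence[str]) -> Dict[str, float]:
    """F1 for each class, computed as 2*TP / (2*TP + FP + FN)."""
    scores: Dict[str, float] = {}
    for label in LABELS:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # A class that never appears in gold or pred scores 0.0 by convention here.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1 scores (an assumed averaging scheme)."""
    return sum(per_class_f1(gold, pred).values()) / len(LABELS)


# Toy usage with invented labels; real inputs would be model predictions on the test split.
example = Dialogue(turns=("sentence 1", "sentence 2", "sentence 3"), label="Happy")
gold_labels = ["Happy", "Sad", "Angry", "Other"]
pred_labels = ["Happy", "Sad", "Other", "Other"]
print(accuracy(gold_labels, pred_labels))
print(macro_f1(gold_labels, pred_labels))
```

Macro averaging weights the four classes equally, which matters if the Other class dominates the corpus; a weighted or micro average would give a different figure on imbalanced data.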
Source journal metrics: CiteScore 1.40; self-citation rate 0.00%; articles published 22; review time 4 weeks