表情符号是否足以训练阿拉伯语推文的情感分类器?

Wegdan A. Hussien, Yahya M. Tashtoush, M. Al-Ayyoub, M. Al-Kabi
{"title":"表情符号是否足以训练阿拉伯语推文的情感分类器?","authors":"Wegdan A. Hussien, Yahya M. Tashtoush, M. Al-Ayyoub, M. Al-Kabi","doi":"10.1109/CSIT.2016.7549459","DOIUrl":null,"url":null,"abstract":"Nowadays, the automatic detection of emotions is employed by many applications across different fields like security informatics, e-learning, humor detection, targeted advertising, etc. Many of these applications focus on social media. In this study, we address the problem of emotion detection in Arabic tweets. We focus on the supervised approach for this problem where a classifier is trained on an already labeled dataset. Typically, such a training set is manually annotated, which is expensive and time consuming. We propose to use an automatic approach to annotate the training data based on using emojis, which are a new generation of emoticons. We show that such an approach produces classifiers that are more accurate than the ones trained on a manually annotated dataset. To achieve our goal, a dataset of emotional Arabic tweets is constructed, where the emotion classes under consideration are: anger, disgust, joy and sadness. Moreover, we consider two classifiers: Support Vector Machine (SVM) and Multinomial Naive Bayes (MNB). The results of the tests show that the automatic labeling approaches using SVM and MNB outperform manual labeling approaches.","PeriodicalId":210905,"journal":{"name":"2016 7th International Conference on Computer Science and Information Technology (CSIT)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2016-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"48","resultStr":"{\"title\":\"Are emoticons good enough to train emotion classifiers of Arabic tweets?\",\"authors\":\"Wegdan A. Hussien, Yahya M. Tashtoush, M. Al-Ayyoub, M. Al-Kabi\",\"doi\":\"10.1109/CSIT.2016.7549459\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Nowadays, the automatic detection of emotions is employed by many applications across different fields like security informatics, e-learning, humor detection, targeted advertising, etc. Many of these applications focus on social media. In this study, we address the problem of emotion detection in Arabic tweets. We focus on the supervised approach for this problem where a classifier is trained on an already labeled dataset. Typically, such a training set is manually annotated, which is expensive and time consuming. We propose to use an automatic approach to annotate the training data based on using emojis, which are a new generation of emoticons. We show that such an approach produces classifiers that are more accurate than the ones trained on a manually annotated dataset. To achieve our goal, a dataset of emotional Arabic tweets is constructed, where the emotion classes under consideration are: anger, disgust, joy and sadness. Moreover, we consider two classifiers: Support Vector Machine (SVM) and Multinomial Naive Bayes (MNB). The results of the tests show that the automatic labeling approaches using SVM and MNB outperform manual labeling approaches.\",\"PeriodicalId\":210905,\"journal\":{\"name\":\"2016 7th International Conference on Computer Science and Information Technology (CSIT)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-07-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"48\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 7th International Conference on Computer Science and Information Technology (CSIT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CSIT.2016.7549459\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 7th International Conference on Computer Science and Information Technology (CSIT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CSIT.2016.7549459","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 48

摘要

如今,情绪的自动检测被应用于许多不同领域,如安全信息学、电子学习、幽默检测、定向广告等。这些应用程序中的许多都侧重于社交媒体。在这项研究中,我们解决了阿拉伯语推文中的情感检测问题。我们专注于这个问题的监督方法,其中分类器是在已经标记的数据集上训练的。通常,这样的训练集是手动标注的,这是昂贵和耗时的。emojis是新一代的表情符号,我们提出了一种基于emojis的自动标注训练数据的方法。我们表明,这种方法产生的分类器比在手动注释数据集上训练的分类器更准确。为了实现我们的目标,我们构建了一个阿拉伯语情绪推文数据集,其中考虑的情绪类别是:愤怒、厌恶、喜悦和悲伤。此外,我们考虑了两种分类器:支持向量机(SVM)和多项朴素贝叶斯(MNB)。实验结果表明,基于SVM和MNB的自动标注方法优于人工标注方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Are emoticons good enough to train emotion classifiers of Arabic tweets?
Nowadays, the automatic detection of emotions is employed by many applications across different fields like security informatics, e-learning, humor detection, targeted advertising, etc. Many of these applications focus on social media. In this study, we address the problem of emotion detection in Arabic tweets. We focus on the supervised approach for this problem where a classifier is trained on an already labeled dataset. Typically, such a training set is manually annotated, which is expensive and time consuming. We propose to use an automatic approach to annotate the training data based on using emojis, which are a new generation of emoticons. We show that such an approach produces classifiers that are more accurate than the ones trained on a manually annotated dataset. To achieve our goal, a dataset of emotional Arabic tweets is constructed, where the emotion classes under consideration are: anger, disgust, joy and sadness. Moreover, we consider two classifiers: Support Vector Machine (SVM) and Multinomial Naive Bayes (MNB). The results of the tests show that the automatic labeling approaches using SVM and MNB outperform manual labeling approaches.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信