Are emoticons good enough to train emotion classifiers of Arabic tweets?

2016 7th International Conference on Computer Science and Information Technology (CSIT) Pub Date : 2016-07-13 DOI:10.1109/CSIT.2016.7549459

Wegdan A. Hussien, Yahya M. Tashtoush, M. Al-Ayyoub, M. Al-Kabi


Citations: 48

Abstract

Nowadays, the automatic detection of emotions is employed by many applications across different fields, such as security informatics, e-learning, humor detection, and targeted advertising. Many of these applications focus on social media. In this study, we address the problem of emotion detection in Arabic tweets. We focus on the supervised approach to this problem, where a classifier is trained on an already labeled dataset. Typically, such a training set is manually annotated, which is expensive and time-consuming. We propose an automatic approach to annotating the training data based on emojis, which are a new generation of emoticons. We show that such an approach produces classifiers that are more accurate than the ones trained on a manually annotated dataset. To achieve our goal, a dataset of emotional Arabic tweets is constructed, where the emotion classes under consideration are anger, disgust, joy, and sadness. Moreover, we consider two classifiers: Support Vector Machine (SVM) and Multinomial Naive Bayes (MNB). The results of the tests show that the automatic labeling approaches using SVM and MNB outperform manual labeling approaches.
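The emoji-based annotation idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the emoji-to-emotion map, the rule of discarding tweets with conflicting emojis, and the function names `label_by_emoji`, `strip_label_emojis`, and `build_training_set` are all assumptions made for illustration. Only the four emotion classes come from the abstract.

```python
from typing import List, Optional, Tuple

# Hypothetical emoji-to-emotion map (an assumption; the abstract does not
# list which emojis were used). Only the four class names are from the paper.
EMOJI_TO_EMOTION = {
    "😠": "anger", "😡": "anger",
    "🤢": "disgust", "😖": "disgust",
    "😊": "joy", "😄": "joy",
    "😢": "sadness", "😭": "sadness",
}


def label_by_emoji(tweet: str) -> Optional[str]:
    """Return the emotion implied by the tweet's emojis, or None when the
    tweet has no mapped emoji or its emojis point to different emotions."""
    emotions = {EMOJI_TO_EMOTION[ch] for ch in tweet if ch in EMOJI_TO_EMOTION}
    return emotions.pop() if len(emotions) == 1 else None


def strip_label_emojis(tweet: str) -> str:
    """Remove the labeling emojis so a classifier trained on the text
    cannot simply memorize the emoji that produced the label."""
    kept = "".join(ch for ch in tweet if ch not in EMOJI_TO_EMOTION)
    return " ".join(kept.split())


def build_training_set(tweets: List[str]) -> List[Tuple[str, str]]:
    """Keep only tweets that receive an unambiguous automatic label."""
    pairs = []
    for tweet in tweets:
        label = label_by_emoji(tweet)
        if label is not None:
            pairs.append((strip_label_emojis(tweet), label))
    return pairs
```

The resulting (text, label) pairs could then be vectorized and passed to any supervised classifier, such as the SVM and MNB models named in the abstract. Dropping ambiguous tweets trades dataset size for cleaner labels, which is one possible design choice among several.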
