{"title":"Multi-Emotion Estimation in Narratives from Crowdsourced Annotations","authors":"Lei Duan, S. Oyama, Haruhiko Sato, M. Kurihara","doi":"10.1145/2756406.2756910","DOIUrl":null,"url":null,"abstract":"Emotion annotations are important metadata for narrative texts in digital libraries. Such annotations are necessary for automatic text-to-speech conversion of narratives and affective education support and can be used as training data for machine learning algorithms to train automatic emotion detectors. However, obtaining high-quality emotion annotations is a challenging problem because it is usually expensive and time-consuming due to the subjectivity of emotion. Moreover, due to the multiplicity of \"emotion\", emotion annotations more naturally fit the paradigm of multi-label classification than that of multi-class classification since one instance (such as a sentence) may evoke a combination of multiple emotion categories. We thus investigated ways to obtain a set of high-quality emotion annotations ({instance, multi-emotion} paired data) from variable-quality crowdsourced annotations. A common quality control strategy for crowdsourced labeling tasks is to aggregate the responses provided by multiple annotators to produce a reliable annotation. Given that the categories of \"emotion\" have characteristics different from those of other kinds of labels, we propose incorporating domain-specific information of emotional consistencies across instances and contextual cues among emotion categories into the aggregation process. Experimental results demonstrate that, from a limited number of crowdsourced annotations, the proposed models enable gold standards to be more effectively estimated than the majority vote and the original domain-independent model.","PeriodicalId":256118,"journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2756406.2756910","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4
Abstract
Emotion annotations are important metadata for narrative texts in digital libraries. Such annotations are needed for automatic text-to-speech conversion of narratives and for affective education support, and they can serve as training data for machine learning algorithms that build automatic emotion detectors. However, obtaining high-quality emotion annotations is challenging because the subjectivity of emotion makes annotation expensive and time-consuming. Moreover, because of the multiplicity of "emotion", emotion annotation fits the paradigm of multi-label classification more naturally than that of multi-class classification, since one instance (such as a sentence) may evoke a combination of several emotion categories. We therefore investigated ways to obtain a set of high-quality emotion annotations ({instance, multi-emotion} paired data) from variable-quality crowdsourced annotations. A common quality control strategy for crowdsourced labeling tasks is to aggregate the responses provided by multiple annotators to produce a reliable annotation. Given that the categories of "emotion" have characteristics different from those of other kinds of labels, we propose incorporating domain-specific information, namely emotional consistency across instances and contextual cues among emotion categories, into the aggregation process. Experimental results demonstrate that, from a limited number of crowdsourced annotations, the proposed models estimate gold-standard labels more effectively than majority voting and the original domain-independent model.
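The baseline the abstract compares against is majority voting over the crowd's responses. The sketch below is a minimal, hypothetical illustration of per-category majority voting for multi-label emotion annotations (it is not the authors' code; the six-category label set and the 0.5 vote threshold are assumptions made only for the example):

```python
# Hypothetical illustration: per-category majority vote over multi-label
# crowdsourced emotion annotations. Each worker marks, for one sentence, the
# subset of emotion categories they think it evokes; a category is kept in the
# aggregate label set if at least half of the workers selected it.
from collections import Counter
from typing import List, Set

# Assumed label set for illustration only; the paper's categories may differ.
EMOTIONS = ["joy", "sadness", "anger", "fear", "surprise", "disgust"]

def majority_vote(annotations: List[Set[str]], threshold: float = 0.5) -> Set[str]:
    """Aggregate one sentence's multi-label annotations by per-category voting."""
    counts = Counter(label for labels in annotations for label in labels)
    n_workers = len(annotations)
    return {e for e in EMOTIONS if counts[e] / n_workers >= threshold}

# Example: three workers annotate the same sentence.
workers = [{"joy", "surprise"}, {"joy"}, {"surprise", "fear"}]
print(majority_vote(workers))  # -> {'joy', 'surprise'} (fear falls below the threshold)
```

The proposed models extend this kind of aggregation with domain knowledge, exploiting emotional consistency across related instances and contextual cues among emotion categories rather than treating each category vote independently.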