The Influence of Annotation, Corpus Design, and Evaluation on the Outcome of Automatic Classification of Human Emotions

JCR Quartile: Q1 (Computer Science)
Markus Kächele, Martin Schels, F. Schwenker
DOI: 10.3389/fict.2016.00027 · Pages: 27 · Published: 2016-11-30 (Journal Article)
Citations: 7

Abstract

The integration of emotions into human-computer interaction applications promises a more natural dialog between users and the technical systems they operate. To construct such machinery, continuous measurement of the user's affective state becomes essential. While basic research aimed at capturing and classifying affective signals has progressed, many issues still prevail that hinder the easy integration of affective signals into human-computer interaction. In this paper, we identify and investigate pitfalls in three steps of the workflow of affective classification studies. The first is the process of collecting affective data for the purpose of training suitable classifiers: emotional data must be created in which the target emotions are present, so human participants have to be stimulated suitably. We discuss the nature of these stimuli, their relevance to human-computer interaction, and the repeatability of the data recording setting. Second, aspects of annotation procedures are investigated, including the variance among individual raters, annotation delay, the impact of the annotation tool used, and how individual ratings are combined into a unified label. Finally, the evaluation protocol is examined, which includes, among others, the impact of the performance measure on the reported accuracy of a classification model. Here we focus especially on the evaluation of classifier outputs against continuously annotated dimensions. Alongside the discussed problems and pitfalls and the ways they affect the outcome, we provide solutions and alternatives to overcome these issues. As a final part of the paper, we sketch a recording scenario and a set of supporting technologies that can help solve many of the issues mentioned above.
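To illustrate two of the steps the abstract mentions — combining individual ratings into a unified label and evaluating classifier outputs against continuous annotation traces — the sketch below shows a naive mean fusion of rater traces and the concordance correlation coefficient (CCC), a measure commonly used for continuous affect evaluation. This is an illustrative sketch, not code from the paper itself; the function names `fuse_raters` and `concordance_cc` are our own, and the paper discusses more elaborate alternatives for both steps.

```python
import numpy as np

def fuse_raters(ratings):
    """Naive fusion of several raters' continuous traces by frame-wise averaging.

    ratings: array-like of shape (n_raters, n_frames).
    Returns a single fused trace of shape (n_frames,).
    """
    return np.asarray(ratings, dtype=float).mean(axis=0)

def concordance_cc(pred, gold):
    """Concordance correlation coefficient (CCC) between two 1-D series.

    Unlike Pearson correlation, CCC also penalizes differences in mean
    and scale, which matters when comparing classifier outputs against
    continuously annotated dimensions such as valence or arousal.
    """
    pred = np.asarray(pred, dtype=float)
    gold = np.asarray(gold, dtype=float)
    mp, mg = pred.mean(), gold.mean()
    vp, vg = pred.var(), gold.var()
    cov = ((pred - mp) * (gold - mg)).mean()
    return 2.0 * cov / (vp + vg + (mp - mg) ** 2)
```

Note that a constant offset between prediction and gold trace lowers CCC even when the Pearson correlation stays at 1, which is exactly the kind of sensitivity one wants when the absolute level of the annotation carries meaning.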
Journal: Frontiers in ICT (Computer Science: Computer Networks and Communications)