Joint Discrete and Continuous Emotion Prediction Using Ensemble and End-to-End Approaches

Ehab Albadawy, Yelin Kim
DOI: 10.1145/3242969.3242972
Published in: Proceedings of the 20th ACM International Conference on Multimodal Interaction, 2018-10-02
Citations: 5

Abstract

This paper presents a novel approach to continuous emotion prediction that characterizes dimensional emotion labels jointly with continuous and discretized representations. Continuous emotion labels can capture subtle emotion variations, but their inherent noise often has negative effects on model training. Recent approaches found a performance gain when converting the continuous labels into a discrete set (e.g., using k-means clustering), despite the resulting label quantization error. To find the optimal trade-off between the continuous and discretized emotion representations, we investigate two joint modeling approaches: ensemble and end-to-end. The ensemble model combines the predictions from two models that are trained separately, one with discretized prediction and the other with continuous prediction. The end-to-end model, in contrast, is trained to simultaneously optimize both the discretized and continuous prediction tasks as well as the final combination between them. Our experimental results using a state-of-the-art deep BLSTM network on the RECOLA dataset demonstrate that (i) the joint representation outperforms both individual representation baselines and the state-of-the-art speech-based results on RECOLA, validating the assumption that combining continuous and discretized emotion representations yields better performance in emotion prediction; and (ii) the joint representation can help accelerate convergence, particularly for valence prediction. Our work provides insights into joint discrete and continuous emotion representation and its efficacy for describing dynamically changing affective behavior in valence and activation prediction.
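The discretization and fusion steps described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation (which uses a deep BLSTM on RECOLA features); all function names here are hypothetical, and the sketch only shows the two ideas in isolation: discretizing continuous labels with 1-D k-means, and fusing a continuous prediction with a discretized one, as an ensemble might.

```python
# Hypothetical sketch: 1-D k-means discretization of continuous emotion
# labels, plus a simple weighted fusion of continuous and discretized
# predictions. Not the paper's BLSTM pipeline.

def kmeans_1d(values, k, iters=20):
    """Cluster scalar labels into k levels; returns the cluster centers."""
    lo, hi = min(values), max(values)
    # Initialize centers evenly across the label range (e.g. [-1, 1] valence).
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Assign each label to its nearest center.
            idx = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[idx].append(v)
        # Recompute each center as the mean of its assigned labels.
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return centers

def discretize(value, centers):
    """Snap a continuous label/prediction to its nearest cluster center."""
    return min(centers, key=lambda c: abs(value - c))

def fuse(cont_pred, disc_pred, alpha=0.5):
    """Ensemble-style combination of continuous and discretized predictions."""
    return alpha * cont_pred + (1 - alpha) * disc_pred
```

For example, clustering the valence labels `[-0.9, -0.8, 0.0, 0.1, 0.85, 0.9]` with `k=3` yields centers near `-0.85`, `0.05`, and `0.875`; a continuous prediction of `0.07` then snaps to `0.05`, and `fuse` trades off between the noisy continuous estimate and the quantized one. In the paper, the end-to-end variant instead learns this combination jointly with both prediction tasks.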