{"title":"Improving Supervised Learning in Conversational Analysis through Reusing Preprocessing Data as Auxiliary Supervisors","authors":"Joshua Y. Kim, Tongliang Liu, K. Yacef","doi":"10.1145/3536220.3558034","DOIUrl":null,"url":null,"abstract":"Emotions recognition systems are trained using noisy human labels and often require heavy preprocessing during multi-modal feature extraction. Using noisy labels in single-task learning increases the risk of over-fitting. Auxiliary tasks could improve the performance of the primary task learning during the same training – multi-task learning (MTL). In this paper, we explore how the preprocessed data used for creating the textual multimodal description of the conversation, that supports conversational analysis, can be re-used as auxiliary tasks (e.g. predicting future or previous labels and predicting the ranked expressions of actions and prosody), thereby promoting the productive use of data. Our main contributions are: (1) the identification of sixteen beneficially auxiliary tasks, (2) studying the method of distributing learning capacity between the primary and auxiliary tasks, and (3) studying the relative supervision hierarchy between the primary and auxiliary tasks. Extensive experiments on IEMOCAP and SEMAINE data validate the improvements over single-task approaches, and suggest that it may generalize across multiple primary tasks.","PeriodicalId":186796,"journal":{"name":"Companion Publication of the 2022 International Conference on Multimodal Interaction","volume":"184 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Companion Publication of the 2022 International Conference on Multimodal Interaction","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3536220.3558034","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
Emotion recognition systems are trained on noisy human labels and often require heavy preprocessing during multi-modal feature extraction. Using noisy labels in single-task learning increases the risk of over-fitting. Auxiliary tasks can improve the performance of the primary task when both are learned during the same training run, an approach known as multi-task learning (MTL). In this paper, we explore how the preprocessed data used to create the textual multimodal description of the conversation, which supports conversational analysis, can be re-used as auxiliary tasks (e.g. predicting future or previous labels, and predicting the ranked expressions of actions and prosody), thereby promoting the productive use of data. Our main contributions are: (1) the identification of sixteen beneficial auxiliary tasks, (2) a study of how to distribute learning capacity between the primary and auxiliary tasks, and (3) a study of the relative supervision hierarchy between the primary and auxiliary tasks. Extensive experiments on IEMOCAP and SEMAINE data validate the improvements over single-task approaches and suggest that the approach may generalize across multiple primary tasks.
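To make the general setup concrete, below is a minimal multi-task learning sketch in PyTorch. It is not the authors' architecture: the encoder type, layer sizes, number of auxiliary heads, task names, and the loss weighting are all illustrative assumptions. It only shows the pattern the abstract describes, namely a shared representation supervised by one primary head (the current emotion label) and several auxiliary heads derived from preprocessing outputs (e.g. previous/next labels, ranked action and prosody expressions).

```python
# Hypothetical MTL sketch: shared encoder + primary and auxiliary heads.
# All dimensions and the auxiliary loss weight are placeholder assumptions.
import torch
import torch.nn as nn

class SharedEncoderMTL(nn.Module):
    def __init__(self, input_dim=300, hidden_dim=128,
                 num_emotions=4, num_aux_classes=4, num_aux_tasks=16):
        super().__init__()
        # Shared learning capacity used by every task.
        self.encoder = nn.GRU(input_dim, hidden_dim, batch_first=True)
        # Primary supervisor: emotion label of the current utterance.
        self.primary_head = nn.Linear(hidden_dim, num_emotions)
        # Auxiliary supervisors re-using preprocessed conversational data.
        self.aux_heads = nn.ModuleList(
            [nn.Linear(hidden_dim, num_aux_classes) for _ in range(num_aux_tasks)]
        )

    def forward(self, x):
        # x: (batch, seq_len, input_dim); h: (1, batch, hidden_dim)
        _, h = self.encoder(x)
        h = h.squeeze(0)
        primary_logits = self.primary_head(h)
        aux_logits = [head(h) for head in self.aux_heads]
        return primary_logits, aux_logits

def mtl_loss(primary_logits, primary_y, aux_logits, aux_ys, aux_weight=0.1):
    """Primary cross-entropy plus a down-weighted sum of auxiliary losses."""
    ce = nn.CrossEntropyLoss()
    loss = ce(primary_logits, primary_y)
    for logits, y in zip(aux_logits, aux_ys):
        loss = loss + aux_weight * ce(logits, y)
    return loss
```

In this sketch, how much of the network is shared versus task-specific corresponds to the paper's question of distributing learning capacity, and the relative weighting of primary versus auxiliary losses corresponds to the relative supervision hierarchy.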