Semantic Disentangling for Audiovisual Induced Emotion

IF 4.5 · Zone 2 (Computer Science) · JCR Q1 COMPUTER SCIENCE, CYBERNETICS

IEEE Transactions on Computational Social Systems Pub Date : 2024-09-16 DOI:10.1109/TCSS.2024.3450717

Qunxi Dong;Wang Zheng;Fuze Tian;Lixian Zhu;Kun Qian;Jingyu Liu;Xuan Zhang

Vol. 12, No. 2, pp. 928-936. Full text: https://ieeexplore.ieee.org/document/10680465/

Citations: 0

Abstract

Emotion regulation plays an important role in human behavior but exhibits considerable heterogeneity among individuals, which weakens the generalization ability of emotion models. In this work, we aim to achieve robust emotion prediction through efficient disentanglement of affective semantic representations. Specifically, the data generation mechanism behind observations from different perspectives is modeled causally, with the emotion-related latent variables explicitly separated into three parts: the intrinsic-related part, the extrinsic-related part, and the spurious-related part. Affective semantic features consist of the first two parts, whereas the spurious latent variables are understood to generate the inherent biases in the data. Furthermore, a variational autoencoder with a reformulated objective function is proposed to learn these disentangled latent variables; it uses only the semantic representations for the final classification task, avoiding interference from the spurious variables. In addition, for the electroencephalography (EEG) data used in this article, a space-frequency mapping method is introduced to improve information utilization. Comprehensive experiments on popular emotion datasets show that the proposed method achieves competitive intersubject generalization performance. Our results highlight the potential of efficient latent representation disentanglement in addressing the complexity challenges of emotion recognition.
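The core idea in the abstract, a latent vector split into intrinsic, extrinsic, and spurious blocks with only the first two feeding the classifier, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the dimensions, the random linear maps standing in for trained networks, and the function names are hypothetical, and the loss shown is the standard VAE objective, not the reformulated objective the paper proposes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
D_IN, D_INTRINSIC, D_EXTRINSIC, D_SPURIOUS, N_CLASSES = 32, 4, 4, 4, 3
D_LATENT = D_INTRINSIC + D_EXTRINSIC + D_SPURIOUS

# Randomly initialised linear maps stand in for trained networks.
W_mu = rng.normal(scale=0.1, size=(D_IN, D_LATENT))
W_logvar = rng.normal(scale=0.1, size=(D_IN, D_LATENT))
W_dec = rng.normal(scale=0.1, size=(D_LATENT, D_IN))
W_cls = rng.normal(scale=0.1, size=(D_INTRINSIC + D_EXTRINSIC, N_CLASSES))


def encode(x):
    """Return mean and log-variance of the Gaussian posterior q(z | x)."""
    return x @ W_mu, x @ W_logvar


def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def split_latent(z):
    """Partition z into intrinsic, extrinsic, and spurious blocks."""
    a = D_INTRINSIC
    b = a + D_EXTRINSIC
    return z[:, :a], z[:, a:b], z[:, b:]


def classify(z):
    """Predict class probabilities from the two semantic blocks only."""
    z_int, z_ext, _ = split_latent(z)  # the spurious block is discarded
    logits = np.concatenate([z_int, z_ext], axis=1) @ W_cls
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def negative_elbo(x):
    """Standard VAE loss: squared reconstruction error plus KL to N(0, I)."""
    mu, logvar = encode(x)
    z = reparameterize(mu, logvar)
    recon = ((z @ W_dec - x) ** 2).sum(axis=1)  # decoder sees all three blocks
    kl = -0.5 * (1.0 + logvar - mu**2 - np.exp(logvar)).sum(axis=1)
    return (recon + kl).mean(), z


x = rng.normal(size=(5, D_IN))  # five synthetic feature vectors
loss, z = negative_elbo(x)
probs = classify(z)
```

In this sketch the decoder reconstructs the input from all three blocks, while the classifier ignores the spurious block entirely, so perturbing that block leaves the predictions unchanged. How the paper actually encourages the blocks to separate is determined by its reformulated objective, which the abstract does not specify.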


Source journal

IEEE Transactions on Computational Social Systems (Social Sciences: Social Sciences (miscellaneous))

CiteScore: 10.00

Self-citation rate: 20.00%

Articles published: 316

Journal description: IEEE Transactions on Computational Social Systems focuses on such topics as modeling, simulation, analysis and understanding of social systems from the quantitative and/or computational perspective. "Systems" include man-man, man-machine and machine-machine organizations and adversarial situations as well as social media structures and their dynamics. More specifically, the proposed transactions publishes articles on modeling the dynamics of social systems, methodologies for incorporating and representing socio-cultural and behavioral aspects in computational modeling, analysis of social system behavior and structure, and paradigms for social systems modeling and simulation. The journal also features articles on social network dynamics, social intelligence and cognition, social systems design and architectures, socio-cultural modeling and representation, and computational behavior modeling, and their applications.