Emotional inverse reasoning trees and dominant modality focus for emotion recognition in conversations

IF 7.6 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Shidan Wei, Xianying Huang, Chengyang Zhang
DOI: 10.1016/j.knosys.2025.114035
Journal: Knowledge-Based Systems, Vol. 326, Article 114035
Published: 2025-07-08
URL: https://www.sciencedirect.com/science/article/pii/S0950705125010809
Citations: 0

Abstract

Emotion recognition in conversation (ERC) is crucial to advancing human–computer interaction. However, current methods often overlook the importance of keywords in emotional expression, neglecting both the emotional information these keywords convey and their dynamic variations. In addition, previous studies have not deeply considered the characteristics and commonalities of heterogeneous modalities before fusion, leading to noise accumulation and weakened intermodal interaction. During multimodal fusion, these methods also fail to account for strength differences between modalities, in particular underestimating the notable influence of the text modality on ERC. Moreover, traditional research has made only limited attempts to enhance modality representation capabilities. To address these issues, we propose the Emotional Inverse Reasoning Trees and Dominant Modality Focus (EIRT-DMF) model for ERC. The model leverages commonsense knowledge to extract keywords from utterances and introduces an innovative emotional inverse reasoning tree structure to enhance textual semantic representation and strengthen the transmission of emotional cues. Meanwhile, we design a modality optimization module to handle intra-modality associations and cross-modality interactions. In the fusion phase, the text modality is employed as the dominant modality to obtain a collaborative understanding of intermodal semantics. In addition, we introduce a hybrid knowledge-distillation mechanism that employs multilevel learning to generate higher-quality multimodal representations. Experiments on the IEMOCAP and MELD datasets indicate that EIRT-DMF achieves state-of-the-art (SOTA) performance, surpassing all baselines.
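The abstract describes the fusion-stage ideas, treating text as the dominant modality and applying a hybrid knowledge-distillation mechanism, only at a high level; the paper's implementation is not reproduced on this page. As a rough, hypothetical illustration of these two ideas, the following PyTorch sketch uses text features as the attention query over the audio and visual streams and pairs the classifier with a standard soft-label distillation term. The class, function, dimensions, and temperature below are our assumptions, not the authors' EIRT-DMF code.

```python
# Illustrative sketch only: names, dimensions, and the temperature are
# assumptions; this is not the authors' EIRT-DMF implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextDominantFusion(nn.Module):
    """Fuse audio/visual features into a text-led representation.

    Text supplies the attention queries, so textual semantics decide
    which audio and visual cues are retained (one simple reading of a
    "dominant modality").
    """

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.text_to_audio = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.text_to_visual = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj = nn.Linear(3 * dim, dim)

    def forward(self, text, audio, visual):
        # Cross-attention: text queries attend over each auxiliary modality.
        a, _ = self.text_to_audio(text, audio, audio)
        v, _ = self.text_to_visual(text, visual, visual)
        # Concatenate text with the attended audio/visual summaries.
        return self.proj(torch.cat([text, a, v], dim=-1))

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label KL term of a generic knowledge-distillation objective."""
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2
```

In a sketch like this, routing all attention queries through the text stream is one straightforward way to privilege the text modality; the paper's actual emotional inverse reasoning trees and multilevel ("hybrid") distillation are more involved than this minimal fusion backbone.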
Source Journal

Knowledge-Based Systems (Engineering & Technology - Computer Science: Artificial Intelligence)
CiteScore: 14.80
Self-citation rate: 12.50%
Annual publications: 1,245
Review time: 7.8 months
Journal description: Knowledge-Based Systems is an international and interdisciplinary journal in artificial intelligence that publishes original, innovative, and creative research, focusing on systems built with knowledge-based and other AI techniques. The journal aims to support human prediction and decision-making through data science and computational techniques, to provide balanced coverage of theory and practical study, and to encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.