Multilevel Representation Disentanglement Framework for Multimodal Sentiment Analysis

IF 3.2 · CAS Zone 2 (Engineering & Technology) · JCR Q2 (ENGINEERING, ELECTRICAL & ELECTRONIC)

IEEE Signal Processing Letters · Pub Date: 2025-04-21 · DOI: 10.1109/LSP.2025.3562827

Nan Jia;Zicong Bai;Tiancheng Xiong;Mingyang Guo

IEEE Signal Processing Letters, vol. 32, pp. 1895-1899. Journal Article. https://ieeexplore.ieee.org/document/10971202/

Citations: 0

Abstract

Multimodal Sentiment Analysis (MSA) has gained wide attention in many fields in recent years. However, the problem of heterogeneity and redundant information among different signals seriously affects the extraction and fusion of sentiment features. To address this challenge, we propose a Multilevel Representational Disentanglement Framework (MRDF) to achieve effective modality fusion and produce refined joint multimodal representations. Specifically, we design a refined semantic decomposition module for learning task-shared representations and modality-exclusive representations by crossmodal translations and task semantic reconstruction. Furthermore, we propose a contrastive learning-based distribution alignment mechanism and an adversarial learning-based distribution alignment strategy to utilize contrastive adversarial learning paradigms to further align the disentangled task-shared representations. Experimental results show that the MRDF framework significantly outperforms existing state-of-the-art methods on the MOSI and MOSEI benchmarks.
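
The abstract does not state which contrastive objective is used to align the task-shared representations. As a rough illustration only, the sketch below assumes an InfoNCE-style loss, a common choice for contrastive alignment, and not necessarily the one used in MRDF. The function names, the temperature value, and the toy feature vectors are all hypothetical. The shared features of the same sample from two modalities are treated as a positive pair, and the other samples in the batch act as negatives.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    anchors[i] should match positives[i]; every other positives[j]
    in the batch serves as a negative for anchors[i].
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity to every candidate, scaled by the temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denom - logits[i]
    return total / len(a)

# Hypothetical task-shared features of three samples from two modalities.
text_shared = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
audio_aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
# Same audio features, deliberately paired with the wrong samples.
audio_shuffled = [audio_aligned[1], audio_aligned[2], audio_aligned[0]]

loss_aligned = info_nce(text_shared, audio_aligned)
loss_shuffled = info_nce(text_shared, audio_shuffled)
print(loss_aligned < loss_shuffled)
```

Under this assumed objective, correctly paired shared features give a lower loss than mismatched ones, which is the sense in which minimizing it pulls the modalities' shared representations together.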


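Source journal

IEEE Signal Processing Letters (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 7.40

Self-citation rate: 12.80%

Articles published: 339

Review time: 2.8 months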

About the journal: The IEEE Signal Processing Letters is a monthly, archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language and audio processing. Papers published in the Letters can be presented within one year of their appearance in signal processing conferences such as ICASSP, GlobalSIP and ICIP, and also in several workshops organized by the Signal Processing Society.