Multilevel Representation Disentanglement Framework for Multimodal Sentiment Analysis
Authors: Nan Jia; Zicong Bai; Tiancheng Xiong; Mingyang Guo
Journal: IEEE Signal Processing Letters, vol. 32, pp. 1895-1899 (published 2025-04-21)
DOI: 10.1109/LSP.2025.3562827
URL: https://ieeexplore.ieee.org/document/10971202/
Impact factor: 3.2; JCR: Q2 (Engineering, Electrical & Electronic)
Multimodal Sentiment Analysis (MSA) has gained wide attention in many fields in recent years. However, heterogeneity and redundant information among different signals seriously affect the extraction and fusion of sentiment features. To address this challenge, we propose a Multilevel Representational Disentanglement Framework (MRDF) to achieve effective modality fusion and produce refined joint multimodal representations. Specifically, we design a refined semantic decomposition module that learns task-shared representations and modality-exclusive representations through crossmodal translation and task semantic reconstruction. Furthermore, we propose a contrastive learning-based distribution alignment mechanism and an adversarial learning-based distribution alignment strategy, which together further align the disentangled task-shared representations. Experimental results show that the MRDF framework significantly outperforms existing state-of-the-art methods on the MOSI and MOSEI benchmarks.
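As a rough illustration of what contrastive alignment of task-shared representations can look like, the sketch below computes an InfoNCE-style loss between the shared vectors of two modalities. The abstract does not specify the loss used in MRDF, so the choice of InfoNCE, the cosine similarity, the temperature value, and all names (`info_nce`, `text_shared`, `audio_aligned`) are assumptions made for illustration only, not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(shared_a, shared_b, temperature=0.1):
    """InfoNCE-style loss (assumed objective, not taken from the paper).

    shared_a, shared_b: lists of task-shared vectors from two modalities,
    one vector per sample; row i of each list belongs to the same sample.
    """
    losses = []
    for i, anchor in enumerate(shared_a):
        logits = [cosine(anchor, other) / temperature for other in shared_b]
        # Numerically stable log-sum-exp over all candidates in the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the matching sample.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

# Toy batch (hypothetical values): two samples, 3-dimensional vectors.
text_shared = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
audio_aligned = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]
audio_swapped = [audio_aligned[1], audio_aligned[0]]

loss_aligned = info_nce(text_shared, audio_aligned)
loss_swapped = info_nce(text_shared, audio_swapped)
print(loss_aligned < loss_swapped)
```

In this toy batch the loss is lower when each text vector is paired with the audio vector of the same sample than when the pairs are swapped, which is the behavior a contrastive alignment term is meant to reward.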
About the journal:
The IEEE Signal Processing Letters is a monthly, archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language and audio processing. Papers published in the Letters can be presented, within one year of their appearance, at signal processing conferences such as ICASSP, GlobalSIP and ICIP, and also at several workshops organized by the Signal Processing Society.