{"title":"基于不完全模态的情感分析中情态特定表示的增强","authors":"Xin Jiang;Lihuo He;Fei Gao;Kaifan Zhang;Jie Li;Xinbo Gao","doi":"10.1109/TMM.2025.3590909","DOIUrl":null,"url":null,"abstract":"Multimodal sentiment analysis aims at exploiting complementary information from multiple modalities or data sources to enhance the understanding and interpretation of sentiment. While existing multi-modal fusion techniques offer significant improvements in sentiment analysis, real-world scenarios often involve missing modalities, introducing complexity due to uncertainty of which modalities may be absent. To tackle the challenge of incomplete modality-specific feature extraction caused by missing modalities, this paper proposes a Cosine Margin-Aware Network (CMANet) which centers on the Cosine Margin-Aware Distillation (CMAD) module. The core module measures distance between samples and the classification boundary, enabling CMANet to focus on samples near the boundary. So, it effectively captures the unique features of different modal combinations. To address the issue of modality imbalance during modality-specific feature extraction, this paper proposes a Weak Modality Regularization (WMR) strategy, which aligns the feature distributions between strong and weak modalities at the dataset-level, while also enhancing the prediction loss of samples at the sample-level. This dual mechanism improves the recognition robustness of weak modality combination. Extensive experiments demonstrate that the proposed method outperforms the previous best model, MMIN, with a 3.82% improvement in unweighted accuracy. 
These results underscore the robustness of the approach under conditions of uncertain and missing modalities.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"27 ","pages":"6793-6804"},"PeriodicalIF":9.7000,"publicationDate":"2025-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Boosting Modal-Specific Representations for Sentiment Analysis With Incomplete Modalities\",\"authors\":\"Xin Jiang;Lihuo He;Fei Gao;Kaifan Zhang;Jie Li;Xinbo Gao\",\"doi\":\"10.1109/TMM.2025.3590909\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multimodal sentiment analysis aims at exploiting complementary information from multiple modalities or data sources to enhance the understanding and interpretation of sentiment. While existing multi-modal fusion techniques offer significant improvements in sentiment analysis, real-world scenarios often involve missing modalities, introducing complexity due to uncertainty of which modalities may be absent. To tackle the challenge of incomplete modality-specific feature extraction caused by missing modalities, this paper proposes a Cosine Margin-Aware Network (CMANet) which centers on the Cosine Margin-Aware Distillation (CMAD) module. The core module measures distance between samples and the classification boundary, enabling CMANet to focus on samples near the boundary. So, it effectively captures the unique features of different modal combinations. To address the issue of modality imbalance during modality-specific feature extraction, this paper proposes a Weak Modality Regularization (WMR) strategy, which aligns the feature distributions between strong and weak modalities at the dataset-level, while also enhancing the prediction loss of samples at the sample-level. This dual mechanism improves the recognition robustness of weak modality combination. 
Extensive experiments demonstrate that the proposed method outperforms the previous best model, MMIN, with a 3.82% improvement in unweighted accuracy. These results underscore the robustness of the approach under conditions of uncertain and missing modalities.\",\"PeriodicalId\":13273,\"journal\":{\"name\":\"IEEE Transactions on Multimedia\",\"volume\":\"27 \",\"pages\":\"6793-6804\"},\"PeriodicalIF\":9.7000,\"publicationDate\":\"2025-07-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Multimedia\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11086405/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Multimedia","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11086405/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Boosting Modal-Specific Representations for Sentiment Analysis With Incomplete Modalities
Multimodal sentiment analysis aims to exploit complementary information from multiple modalities or data sources to enhance the understanding and interpretation of sentiment. While existing multimodal fusion techniques offer significant improvements in sentiment analysis, real-world scenarios often involve missing modalities, which introduces complexity because it is uncertain which modalities may be absent. To tackle the challenge of incomplete modality-specific feature extraction caused by missing modalities, this paper proposes a Cosine Margin-Aware Network (CMANet), which centers on the Cosine Margin-Aware Distillation (CMAD) module. This core module measures the distance between samples and the classification boundary, enabling CMANet to focus on samples near the boundary and thereby effectively capture the unique features of different modal combinations. To address the issue of modality imbalance during modality-specific feature extraction, this paper further proposes a Weak Modality Regularization (WMR) strategy, which aligns the feature distributions of strong and weak modalities at the dataset level while increasing the prediction loss of weak-modality samples at the sample level. This dual mechanism improves recognition robustness for weak modality combinations. Extensive experiments demonstrate that the proposed method outperforms the previous best model, MMIN, with a 3.82% improvement in unweighted accuracy. These results underscore the robustness of the approach under conditions of uncertain and missing modalities.
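The abstract describes CMAD as weighting attention toward samples that lie near the classification boundary, measured by a cosine margin. The paper's exact formulation is not given here, so the following is only a minimal illustrative sketch of one way such margin-based sample weighting could work: the margin is taken as the gap between the two largest cosine similarities to hypothetical class centers, and a small gap (a near-boundary sample) yields a larger weight via an exponential. The function names, the class-center representation, and the `exp(-margin / tau)` weighting are all assumptions for illustration, not the authors' actual method.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def boundary_weight(feature, class_centers, tau=0.1):
    """Weight a sample by its cosine margin to the classification boundary.

    The margin is the gap between the two largest cosine similarities to
    the class centers; a small gap means the sample lies near the
    boundary, so it receives a larger weight via exp(-margin / tau).
    (Illustrative formulation only, not the paper's definition.)
    """
    sims = sorted((cosine(feature, c) for c in class_centers), reverse=True)
    margin = sims[0] - sims[1]
    return math.exp(-margin / tau)

# A sample almost equidistant from two class centers gets a much higher
# weight than one lying clearly inside a single class region.
centers = [[1.0, 0.0], [0.0, 1.0]]
near = boundary_weight([0.7, 0.71], centers)   # close to the boundary
far = boundary_weight([1.0, 0.05], centers)    # clearly class 0
```

Under this sketch, training losses could simply be multiplied by these weights so that near-boundary samples, where modal combinations are hardest to separate, dominate the distillation signal.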
Journal introduction:
The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.