Modality emotion semantic correlation analysis for multimodal emotion recognition

Yuqing Zhang, Dongliang Xie, Dawei Luo, Baosheng Sun

Computers & Electrical Engineering, Volume 126, Article 110467 (published 2025-06-10). DOI: 10.1016/j.compeleceng.2025.110467. Available at https://www.sciencedirect.com/science/article/pii/S0045790625004100
Abstract
Affective computing is a fundamental technology and a crucial prerequisite for natural, anthropomorphic human–computer interaction. However, emotional expression is complex and multi-dimensional, and the heterogeneity gap among distinct modalities poses significant challenges for multimodal emotion recognition. To tackle this issue, we propose modality emotion semantic correlation analysis (MESCA), which enhances multimodal affective semantic consistency by using modality correlation learning to make the information from different modalities complementary. Specifically, we first design a modal-pair correlation module that measures emotion semantic consistency across text, audio, and video. By fusing complementary semantic information, this module yields a more comprehensive picture of the emotional state and reduces the redundancy typical of pairwise interaction methods. Next, we apply structural re-parameterization, which converts the multi-branch training structure into a single-branch inference structure, cutting the computational cost of inference and making recognition more efficient. The proposed model is evaluated on two public datasets, IEMOCAP and CMU-MOSEI. Compared with baseline methods, MESCA substantially improves efficiency while maintaining prediction accuracy on IEMOCAP, and surpasses the baselines in both efficiency and accuracy on CMU-MOSEI.
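Two ideas in the abstract can be made concrete with short sketches. The paper's implementation is not published here, so everything below is a minimal illustration under stated assumptions, not the authors' code. First, a modal-pair correlation module that scores emotion semantic consistency between modality pairs could, assuming each modality has already been encoded into a fixed-size embedding, score each pair with cosine similarity and weight the fusion by cross-modal agreement (all function and variable names are hypothetical):

```python
import torch
import torch.nn.functional as F

def modal_pair_fusion(text_emb, audio_emb, video_emb):
    """Score pairwise emotion-semantic consistency with cosine
    similarity, then fuse the three modalities weighted by how
    strongly each agrees with the other two (illustrative only)."""
    def _cos(a, b):
        return F.cosine_similarity(a, b, dim=-1)          # (B,)

    embs = torch.stack([text_emb, audio_emb, video_emb])  # (3, B, D)
    # Each modality's agreement is the sum of its similarities
    # to the other two modalities.
    agreement = torch.stack([
        _cos(text_emb, audio_emb) + _cos(text_emb, video_emb),
        _cos(audio_emb, text_emb) + _cos(audio_emb, video_emb),
        _cos(video_emb, text_emb) + _cos(video_emb, audio_emb),
    ])                                                    # (3, B)
    weights = torch.softmax(agreement, dim=0).unsqueeze(-1)  # (3, B, 1)
    return (weights * embs).sum(dim=0)                    # (B, D)
```

Second, structural re-parameterization folds a multi-branch training structure into a single-branch inference structure. For parallel linear branches whose outputs are summed, the fold is exact, since (W1 x + b1) + (W2 x + b2) = (W1 + W2) x + (b1 + b2). The sketch below demonstrates that equivalence; it is again illustrative, not the MESCA architecture:

```python
import torch
import torch.nn as nn

class MultiBranch(nn.Module):
    """Two parallel linear branches summed: the multi-branch
    training-time structure (illustrative)."""
    def __init__(self, dim):
        super().__init__()
        self.b1 = nn.Linear(dim, dim)
        self.b2 = nn.Linear(dim, dim)

    def forward(self, x):
        return self.b1(x) + self.b2(x)

    def reparameterize(self):
        """Fold both branches into one linear layer for inference:
        W = W1 + W2, b = b1 + b2."""
        fused = nn.Linear(self.b1.in_features, self.b1.out_features)
        with torch.no_grad():
            fused.weight.copy_(self.b1.weight + self.b2.weight)
            fused.bias.copy_(self.b1.bias + self.b2.bias)
        return fused

x = torch.randn(4, 16)
m = MultiBranch(16)
single = m.reparameterize()
# The single-branch model computes the same function as the
# multi-branch one, up to floating-point error.
assert torch.allclose(m(x), single(x), atol=1e-6)
```

The same algebra extends to parallel convolution branches, which is how RepVGG-style re-parameterization removes multi-branch overhead at inference while leaving the trained function unchanged.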
About the Journal
The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency.
Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.