Uncertainty-Aware Graph Contrastive Fusion Network for multimodal physiological signal emotion recognition
Authors: Guangqiang Li, Ning Chen, Hongqing Zhu, Jing Li, Zhangyong Xu, Zhiying Zhu
Journal: Neural Networks, volume 187, Article 107363 (Impact Factor 6.0; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2025-03-14
Publication type: Journal Article
DOI: 10.1016/j.neunet.2025.107363
URL: https://www.sciencedirect.com/science/article/pii/S0893608025002424
Citation count: 0
Abstract
Graph Neural Networks (GNNs) have been widely adopted to mine the topological patterns in physiological signals for emotion recognition. However, because physiological signals are non-stationary and susceptible to various kinds of noise, the inter-sensor connectivity within each modality is uncertain. This intra-modal connectivity uncertainty may in turn lead to uncertainty in the inter-modal semantic gap, which causes the unimodal bias problem and greatly reduces fusion effectiveness. This issue has not been fully considered in existing multimodal fusion models. To this end, we propose an Uncertainty-Aware Graph Contrastive Fusion Network (UAGCFNet) to fuse multimodal physiological signals effectively for emotion recognition. First, a probabilistic model-based Uncertainty-Aware Graph Convolutional Network (UAGCN), which estimates and quantifies the inter-sensor connectivity uncertainty, is constructed for each modality to extract its uncertainty-aware graph representation. Second, a Transitive Contrastive Fusion (TCF) module, which combines a Criss-Cross Attention (CCA)-based fusion mechanism with a Transitive Contrastive Learning (TCL)-based calibration strategy, is designed to fuse the multimodal graph representations effectively by eliminating the unimodal bias problem that results from the inter-modal semantic gap uncertainty. Extensive experiments on the DEAP, DREAMER, and MPED datasets under both subject-dependent and subject-independent scenarios demonstrate that (i) the proposed model outperforms State-Of-The-Art (SOTA) multimodal fusion models with fewer parameters and lower computational complexity; (ii) each key module and loss function contributes significantly to the performance of the proposed model; and (iii) the proposed model eliminates the unimodal bias problem effectively.
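The abstract does not specify how UAGCN models connectivity uncertainty, so the following is only an illustrative sketch of one common way to make a graph convolution uncertainty-aware, not the authors' implementation. It assumes each inter-sensor edge weight follows a Gaussian with a learned mean and log-variance, draws several adjacency samples, and reports the mean and variance of the resulting node features. All names (`normalize_adjacency`, `uncertain_graph_conv`, `edge_mean`, `edge_log_var`) are hypothetical.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    a = adj + np.eye(adj.shape[0])
    deg = a.sum(axis=1)  # >= 1 because of self-loops and non-negative weights
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def clean_adjacency(adj):
    """Symmetrize, clip to non-negative weights, and zero the diagonal."""
    a = np.maximum(0.5 * (adj + adj.T), 0.0)
    np.fill_diagonal(a, 0.0)
    return a


def uncertain_graph_conv(x, edge_mean, edge_log_var, weight,
                         n_samples=32, rng=None):
    """Monte Carlo graph convolution over sampled adjacency matrices.

    ASSUMPTION (not from the paper): edge weights are independent Gaussians
    parameterized by edge_mean and edge_log_var.

    x:            (n_sensors, n_features) node features
    edge_mean:    (n_sensors, n_sensors) mean edge weights
    edge_log_var: (n_sensors, n_sensors) log-variance of edge weights
    weight:       (n_features, n_out) layer weights
    Returns the per-node mean and variance of the ReLU-activated outputs.
    """
    rng = np.random.default_rng() if rng is None else rng
    std = np.exp(0.5 * edge_log_var)
    outputs = []
    for _ in range(n_samples):
        # Reparameterized sample: mean + std * standard normal noise.
        eps = rng.standard_normal(edge_mean.shape)
        adj = clean_adjacency(edge_mean + std * eps)
        h = normalize_adjacency(adj) @ x @ weight
        outputs.append(np.maximum(h, 0.0))  # ReLU
    outputs = np.stack(outputs)
    return outputs.mean(axis=0), outputs.var(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, 3))          # 4 sensors, 3 features each
    edge_mean = rng.uniform(0.0, 1.0, (4, 4))
    edge_log_var = np.full((4, 4), -2.0)     # moderate edge uncertainty
    weight = rng.standard_normal((3, 2))
    mean, var = uncertain_graph_conv(x, edge_mean, edge_log_var, weight, rng=rng)
    print(mean.shape, var.shape)
```

In this sketch, the output variance serves as a rough proxy for how much the connectivity uncertainty affects each sensor's representation; a fusion stage could, under the same assumptions, down-weight modalities whose representations are more uncertain.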
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.