{"title":"BC-PMJRS:脑计算启发的预定义多模态联合表征空间,用于加强跨模态学习","authors":"Jiahao Qin , Feng Liu , Lu Zong","doi":"10.1016/j.neunet.2025.107449","DOIUrl":null,"url":null,"abstract":"<div><div>Multimodal learning faces two key challenges: effectively fusing complex information from different modalities, and designing efficient mechanisms for cross-modal interactions. Inspired by neural plasticity and information processing principles in the human brain, this paper proposes BC-PMJRS, a Brain Computing-inspired Predefined Multimodal Joint Representation Spaces method to enhance cross-modal learning. The method learns the joint representation space through two complementary optimization objectives: (1) minimizing mutual information between representations of different modalities to reduce redundancy and (2) maximizing mutual information between joint representations and sentiment labels to improve task-specific discrimination. These objectives are balanced dynamically using an adaptive optimization strategy inspired by long-term potentiation (LTP) and long-term depression (LTD) mechanisms. Furthermore, we significantly reduce the computational complexity of modal interactions by leveraging a global–local cross-modal interaction mechanism, analogous to selective attention in the brain. Experimental results on the IEMOCAP, MOSI, and MOSEI datasets demonstrate that BC-PMJRS outperforms state-of-the-art models in both complete and incomplete modality settings, achieving up to a 1.9% improvement in weighted-F1 on IEMOCAP, a 2.8% gain in 7-class accuracy on MOSI, and a 2.9% increase in 7-class accuracy on MOSEI. These substantial improvements across multiple datasets demonstrate that incorporating brain-inspired mechanisms, particularly the dynamic balance of information redundancy and task relevance through neural plasticity principles, effectively enhances multimodal learning. This work bridges neuroscience principles with multimodal machine learning, offering new insights for developing more effective and biologically plausible models.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"188 ","pages":"Article 107449"},"PeriodicalIF":6.0000,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"BC-PMJRS: A Brain Computing-inspired Predefined Multimodal Joint Representation Spaces for enhanced cross-modal learning\",\"authors\":\"Jiahao Qin , Feng Liu , Lu Zong\",\"doi\":\"10.1016/j.neunet.2025.107449\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Multimodal learning faces two key challenges: effectively fusing complex information from different modalities, and designing efficient mechanisms for cross-modal interactions. Inspired by neural plasticity and information processing principles in the human brain, this paper proposes BC-PMJRS, a Brain Computing-inspired Predefined Multimodal Joint Representation Spaces method to enhance cross-modal learning. The method learns the joint representation space through two complementary optimization objectives: (1) minimizing mutual information between representations of different modalities to reduce redundancy and (2) maximizing mutual information between joint representations and sentiment labels to improve task-specific discrimination. These objectives are balanced dynamically using an adaptive optimization strategy inspired by long-term potentiation (LTP) and long-term depression (LTD) mechanisms. 
Furthermore, we significantly reduce the computational complexity of modal interactions by leveraging a global–local cross-modal interaction mechanism, analogous to selective attention in the brain. Experimental results on the IEMOCAP, MOSI, and MOSEI datasets demonstrate that BC-PMJRS outperforms state-of-the-art models in both complete and incomplete modality settings, achieving up to a 1.9% improvement in weighted-F1 on IEMOCAP, a 2.8% gain in 7-class accuracy on MOSI, and a 2.9% increase in 7-class accuracy on MOSEI. These substantial improvements across multiple datasets demonstrate that incorporating brain-inspired mechanisms, particularly the dynamic balance of information redundancy and task relevance through neural plasticity principles, effectively enhances multimodal learning. This work bridges neuroscience principles with multimodal machine learning, offering new insights for developing more effective and biologically plausible models.</div></div>\",\"PeriodicalId\":49763,\"journal\":{\"name\":\"Neural Networks\",\"volume\":\"188 \",\"pages\":\"Article 107449\"},\"PeriodicalIF\":6.0000,\"publicationDate\":\"2025-04-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Networks\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0893608025003284\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608025003284","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
BC-PMJRS: A Brain Computing-inspired Predefined Multimodal Joint Representation Spaces for enhanced cross-modal learning
Multimodal learning faces two key challenges: effectively fusing complex information from different modalities, and designing efficient mechanisms for cross-modal interactions. Inspired by neural plasticity and information processing principles in the human brain, this paper proposes BC-PMJRS, a Brain Computing-inspired Predefined Multimodal Joint Representation Spaces method to enhance cross-modal learning. The method learns the joint representation space through two complementary optimization objectives: (1) minimizing mutual information between representations of different modalities to reduce redundancy and (2) maximizing mutual information between joint representations and sentiment labels to improve task-specific discrimination. These objectives are balanced dynamically using an adaptive optimization strategy inspired by long-term potentiation (LTP) and long-term depression (LTD) mechanisms. Furthermore, we significantly reduce the computational complexity of modal interactions by leveraging a global–local cross-modal interaction mechanism, analogous to selective attention in the brain. Experimental results on the IEMOCAP, MOSI, and MOSEI datasets demonstrate that BC-PMJRS outperforms state-of-the-art models in both complete and incomplete modality settings, achieving up to a 1.9% improvement in weighted-F1 on IEMOCAP, a 2.8% gain in 7-class accuracy on MOSI, and a 2.9% increase in 7-class accuracy on MOSEI. These substantial improvements across multiple datasets demonstrate that incorporating brain-inspired mechanisms, particularly the dynamic balance of information redundancy and task relevance through neural plasticity principles, effectively enhances multimodal learning. This work bridges neuroscience principles with multimodal machine learning, offering new insights for developing more effective and biologically plausible models.
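The abstract specifies the two objectives but not their estimators. The following PyTorch sketch shows one way such a pair of mutual-information terms could be combined, under assumed choices: a CLUB-style variational upper bound stands in for the cross-modal redundancy penalty, cross-entropy serves as the usual proxy for maximizing mutual information between the joint representation and labels, and a single scalar weight is nudged up or down in an LTP/LTD-like fashion. The names (`DualMIObjective`, `update_balance`, `lam`) and the update rule itself are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualMIObjective(nn.Module):
    """Illustrative pairing of the two objectives: a CLUB-style upper bound
    on I(z_a; z_b) penalizes cross-modal redundancy, while cross-entropy on
    the fused representation acts as the standard proxy for maximizing
    I(joint representation; label)."""

    def __init__(self, dim: int, num_classes: int):
        super().__init__()
        # Variational network q(z_b | z_a) for the CLUB bound
        # (assumed Gaussian with unit variance for brevity).
        self.q_mu = nn.Linear(dim, dim)
        self.classifier = nn.Linear(2 * dim, num_classes)
        # Scalar balance between the two terms; updated outside autograd.
        self.register_buffer("lam", torch.tensor(0.5))

    def mi_upper_bound(self, z_a: torch.Tensor, z_b: torch.Tensor) -> torch.Tensor:
        # CLUB: E_p(a,b)[log q(b|a)] - E_p(a)p(b)[log q(b|a)], with
        # log q(b|a) proportional to -||b - mu(a)||^2.
        mu = self.q_mu(z_a)
        log_pos = -((z_b - mu) ** 2).mean()
        shuffled = z_b[torch.randperm(z_b.size(0), device=z_b.device)]
        log_neg = -((shuffled - mu) ** 2).mean()
        return log_pos - log_neg

    def forward(self, z_a, z_b, labels):
        redundancy = self.mi_upper_bound(z_a, z_b)               # minimize
        logits = self.classifier(torch.cat([z_a, z_b], dim=-1))
        task = F.cross_entropy(logits, labels)                   # minimize
        return task + self.lam * redundancy

    @torch.no_grad()
    def update_balance(self, task_delta: float, mi_delta: float, eta: float = 0.01):
        # Assumed LTP/LTD-like rule: potentiate the redundancy penalty while
        # it still yields progress, depress it when the task term dominates.
        self.lam.add_(eta * (mi_delta - task_delta)).clamp_(0.0, 1.0)
```

In this reading, the LTP/LTD analogy amounts to a slow, sign-driven adjustment of a single loss weight: the redundancy term is strengthened while reducing it still produces progress and weakened once the task term stalls.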
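Likewise, the global–local interaction mechanism is described only at a high level. Below is a minimal sketch of one plausible reading: each query token cross-attends to a single mean-pooled "global" summary of the other modality plus a small "local" window around its own position, so the per-token cost drops from O(Tk) to O(window). The window size, the mean-pooled summary, and the assumption of roughly aligned sequences are all illustrative choices, not details taken from the paper.

```python
import torch
import torch.nn as nn

class GlobalLocalCrossAttention(nn.Module):
    """One plausible reading of global-local cross-modal interaction:
    each query token attends to a pooled summary of the other modality
    ('global') plus a small temporal window around its own position
    ('local'), instead of the full key sequence."""

    def __init__(self, dim: int, num_heads: int = 4, window: int = 4):
        super().__init__()
        assert dim % num_heads == 0, "dim must be divisible by num_heads"
        self.window = window
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, queries: torch.Tensor, keys: torch.Tensor) -> torch.Tensor:
        # queries: (B, Tq, D), keys: (B, Tk, D); assumes roughly aligned sequences.
        global_summary = keys.mean(dim=1, keepdim=True)          # (B, 1, D)
        outputs = []
        for t in range(queries.size(1)):
            lo = max(0, t - self.window)
            hi = min(keys.size(1), t + self.window + 1)
            # Context = 1 global token + at most 2*window + 1 local tokens,
            # so per-token cost is O(window) rather than O(Tk).
            context = torch.cat([global_summary, keys[:, lo:hi]], dim=1)
            out, _ = self.attn(queries[:, t : t + 1], context, context)
            outputs.append(out)
        return torch.cat(outputs, dim=1)                          # (B, Tq, D)
```

A production implementation would batch the windowed attention rather than loop over positions; the loop is kept here only to make the global-plus-local context explicit.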
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. The journal invites submissions covering all aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussion between biology and technology, it aims to encourage the development of biologically inspired artificial intelligence.