BC-PMJRS: A Brain Computing-inspired Predefined Multimodal Joint Representation Spaces for enhanced cross-modal learning

Impact factor 6.0; Zone 1 (Computer Science); JCR Q1 (Computer Science, Artificial Intelligence)

Neural Networks, Pub Date: 2025-04-10, DOI: 10.1016/j.neunet.2025.107449

Jiahao Qin, Feng Liu, Lu Zong

Volume 188, Article 107449. Publisher page: https://www.sciencedirect.com/science/article/pii/S0893608025003284

Citations: 0

Abstract

Multimodal learning faces two key challenges: effectively fusing complex information from different modalities, and designing efficient mechanisms for cross-modal interactions. Inspired by neural plasticity and information processing principles in the human brain, this paper proposes BC-PMJRS, a Brain Computing-inspired Predefined Multimodal Joint Representation Spaces method to enhance cross-modal learning. The method learns the joint representation space through two complementary optimization objectives: (1) minimizing mutual information between representations of different modalities to reduce redundancy and (2) maximizing mutual information between joint representations and sentiment labels to improve task-specific discrimination. These objectives are balanced dynamically using an adaptive optimization strategy inspired by long-term potentiation (LTP) and long-term depression (LTD) mechanisms. Furthermore, we significantly reduce the computational complexity of modal interactions by leveraging a global–local cross-modal interaction mechanism, analogous to selective attention in the brain. Experimental results on the IEMOCAP, MOSI, and MOSEI datasets demonstrate that BC-PMJRS outperforms state-of-the-art models in both complete and incomplete modality settings, achieving up to a 1.9% improvement in weighted-F1 on IEMOCAP, a 2.8% gain in 7-class accuracy on MOSI, and a 2.9% increase in 7-class accuracy on MOSEI. These substantial improvements across multiple datasets demonstrate that incorporating brain-inspired mechanisms, particularly the dynamic balance of information redundancy and task relevance through neural plasticity principles, effectively enhances multimodal learning. This work bridges neuroscience principles with multimodal machine learning, offering new insights for developing more effective and biologically plausible models.
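The two objectives described in the abstract can be pictured with a small sketch. The abstract does not specify the estimators or the update rule, so everything below is an assumption made for illustration: `mutual_information` is a plain plug-in estimate over discrete values (real multimodal representations are continuous and would need a different estimator), `joint_objective` simply combines the two terms with opposite signs, and `adapt_weights` is a hypothetical potentiation/depression-style rule, not the authors' strategy.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (count / n) * math.log(count * n / (px[x] * py[y]))
    return mi


def joint_objective(mi_between_modalities, mi_with_labels, w_redundancy, w_task):
    """Loss to minimize: penalize cross-modal redundancy, reward label relevance."""
    return w_redundancy * mi_between_modalities - w_task * mi_with_labels


def adapt_weights(w_redundancy, w_task, prev, curr, rate=0.1):
    """Hypothetical LTP/LTD-style rule (an assumption, not taken from the paper).

    prev and curr are (mi_between_modalities, mi_with_labels) pairs from two
    consecutive steps. A term that got worse has its weight strengthened
    ("potentiation"); otherwise its weight is weakened ("depression").
    The weights are then renormalized to sum to 1.
    """
    prev_red, prev_task = prev
    curr_red, curr_task = curr
    # Redundancy should fall; if it rose, strengthen its weight.
    w_redundancy *= (1 + rate) if curr_red > prev_red else (1 - rate)
    # Task relevance should rise; if it fell, strengthen its weight.
    w_task *= (1 + rate) if curr_task < prev_task else (1 - rate)
    total = w_redundancy + w_task
    return w_redundancy / total, w_task / total
```

In this toy setup, two identical binary sequences share log 2 nats of information, while two independent ones share none, so driving the first term down and the second up pulls the modality representations apart while keeping them predictive of the labels.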

Source journal

Neural Networks (Engineering and Technology: Computer Science, Artificial Intelligence)

CiteScore: 13.90

Self-citation rate: 7.70%

Articles published: 425

Review time: 67 days

About the journal: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically inspired artificial intelligence.