Multimodal learning rebalanced: Negative correlation ensembles for improved performance

Impact factor: 6.0 | Zone 1, Computer Science | JCR Q1: Computer Science, Artificial Intelligence

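Neural Networks, Volume 190, Article 107686. Pub Date: 2025-06-10. DOI: 10.1016/j.neunet.2025.107686. URL: https://www.sciencedirect.com/science/article/pii/S0893608025005660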

Zhixian Wang, Tao Zhang, Wu Huang


Citations: 0

Abstract

Multimodal learning aims to integrate information from different modalities to overcome the limitations of single-modal information. Recent research has shown that multimodal learning methods often focus on optimizing a dominant modality, which leaves the model's overall performance underdeveloped and sometimes even inferior to that of single-modal models, a phenomenon referred to as the modality imbalance problem. To overcome this issue, some studies first identify the dominant modality and then adaptively adjust the gradients or loss functions. However, while these methods improve the convergence of the non-dominant modalities, they often reduce the model's ability to use information from the dominant modality. We therefore treat each modality in our model as a base classifier and address the modality imbalance problem from the perspective of ensemble learning, so that the information from each modality is fully leveraged. In addition, we introduce negative correlation learning to ensure the diversity of information encoding across modalities. Experiments across multiple datasets, several late fusion techniques, and a variety of tasks validate the proposed method, which shows significant improvements in accuracy over existing approaches.
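The abstract does not state the loss the authors use, so the following is only a rough illustration of the general idea it names. It sketches the classical negative correlation learning penalty for an ensemble whose members are per-modality classifiers: each member pays its own squared error plus a term that rewards disagreeing with the ensemble mean. The function name `ncl_losses`, the weight `lam`, the array shapes, and the use of averaging as the late fusion step are all assumptions made for this sketch, not details taken from the paper.

```python
import numpy as np


def ncl_losses(member_outputs, targets, lam=0.5):
    """Classical negative-correlation-learning loss per ensemble member.

    member_outputs: array of shape (M, N, C), one slice per modality
                    classifier (hypothetical layout for this sketch).
    targets:        array of shape (N, C), e.g. one-hot labels.
    lam:            penalty weight (hypothetical name); lam = 0 trains
                    every member independently.
    Returns (per_member_loss of shape (M,), ensemble_output of shape (N, C)).
    """
    outputs = np.asarray(member_outputs, dtype=float)
    targets = np.asarray(targets, dtype=float)

    # Late fusion by simple averaging (an assumption of this sketch).
    ensemble = outputs.mean(axis=0)

    # Deviation of each member from the ensemble mean.
    deviation = outputs - ensemble

    # Classical penalty: p_i = (F_i - F_bar) * sum_{j != i} (F_j - F_bar).
    # Deviations sum to zero over members, so this equals -(F_i - F_bar)^2.
    penalty = -(deviation ** 2)

    # Individual squared error of each member.
    error = 0.5 * (outputs - targets) ** 2

    # Sum over classes, average over samples.
    per_member = (error + lam * penalty).sum(axis=2).mean(axis=1)
    return per_member, ensemble


if __name__ == "__main__":
    # Two hypothetical modality classifiers, two samples, two classes.
    outputs = np.array([
        [[0.9, 0.1], [0.2, 0.8]],   # e.g. an audio branch
        [[0.6, 0.4], [0.5, 0.5]],   # e.g. a visual branch
    ])
    targets = np.array([[1.0, 0.0], [0.0, 1.0]])
    losses, fused = ncl_losses(outputs, targets, lam=1.0)
    print("per-member loss:", losses)
    print("fused prediction:", fused)
```

With `lam = 0` the penalty vanishes and each modality is trained on its own error alone; increasing `lam` pushes the members toward different predictions, which is the diversity effect the abstract attributes to negative correlation learning.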

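Source journal

Neural Networks (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 13.90
Self-citation rate: 7.70%
Articles published: 425
Review time: 67 days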

Journal description: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. The journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, it aims to encourage the development of biologically inspired artificial intelligence.