Multimodal learning rebalanced: Negative correlation ensembles for improved performance

Impact factor: 6.0 · Zone 1 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
Zhixian Wang, Tao Zhang, Wu Huang
Journal: Neural Networks, volume 190, Article 107686
Publication date: 2025-06-10
Publication type: Journal Article
DOI: 10.1016/j.neunet.2025.107686
URL: https://www.sciencedirect.com/science/article/pii/S0893608025005660
Citations: 0

Abstract

Multimodal learning aims to integrate information from different modalities to overcome the limitations of single-modal information. Recent research has shown that multimodal learning methods often focus on optimizing a dominant modality, which leaves the model's performance underdeveloped and sometimes even inferior to that of single-modal models, a phenomenon referred to as the modality imbalance problem. To overcome this issue, some studies identify the dominant modality and adaptively adjust the gradients or loss functions accordingly. However, while these methods improve the convergence of the non-dominant modalities, they often reduce the model's ability to use information from the dominant modality. Therefore, we treat each modality in our model as a base classifier and address the modality imbalance problem from the perspective of ensemble learning, thus fully leveraging the information from each modality. In addition, we introduce negative correlation learning to ensure the diversity of information encoding across different modalities. Experiments across multiple datasets, various late fusion techniques, and a variety of tasks validate the proposed method, which shows significant improvements in accuracy over existing approaches.
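The abstract does not state the loss the authors use, so the following is only a minimal sketch of the classical negative correlation learning penalty, applied here to per-modality predictions. It assumes simple averaging as the late fusion step; the function names `ensemble_mean` and `ncl_losses` and the weight `lam` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def ensemble_mean(preds: Sequence[Sequence[float]]) -> List[float]:
    """Fuse per-modality predictions by plain averaging (an assumed fusion rule)."""
    n_modalities = len(preds)
    n_outputs = len(preds[0])
    return [sum(p[k] for p in preds) / n_modalities for k in range(n_outputs)]


def ncl_losses(
    preds: Sequence[Sequence[float]],
    target: Sequence[float],
    lam: float = 0.5,
) -> List[float]:
    """Per-modality loss: squared error plus lam times the classical NCL penalty.

    For member i with output F_i and ensemble mean F_bar, the penalty is
    (F_i - F_bar) * sum_{j != i} (F_j - F_bar), which equals -(F_i - F_bar)^2
    because the deviations from the mean sum to zero.
    """
    mean = ensemble_mean(preds)
    losses = []
    for p in preds:
        # Accuracy term: how far this modality's prediction is from the target.
        err = sum(0.5 * (pk - tk) ** 2 for pk, tk in zip(p, target))
        # Diversity term: negative squared distance from the fused prediction.
        penalty = -sum((pk - mk) ** 2 for pk, mk in zip(p, mean))
        losses.append(err + lam * penalty)
    return losses


if __name__ == "__main__":
    audio = [0.9, 0.1]   # hypothetical class probabilities from one modality
    visual = [0.5, 0.5]  # hypothetical class probabilities from another
    losses = ncl_losses([audio, visual], target=[1.0, 0.0], lam=0.5)
    print([round(v, 3) for v in losses])  # roughly [-0.03, 0.21]
```

When every modality produces the same prediction, the penalty is zero and each loss reduces to the plain squared error, so the extra term only rewards modalities for disagreeing with the fused output. Because the penalty is negative, `lam` is usually kept small (for example between 0 and 1) so that the accuracy term still dominates.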
Source journal
Neural Networks (Engineering and Technology: Computer Science, Artificial Intelligence)
CiteScore: 13.90
Self-citation rate: 7.70%
Articles published: 425
Review time: 67 days
About the journal: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. The journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, it aims to encourage the development of biologically inspired artificial intelligence.