Multimodal learning rebalanced: Negative correlation ensembles for improved performance
Authors: Zhixian Wang, Tao Zhang, Wu Huang
Journal: Neural Networks, volume 190, Article 107686 (impact factor 6.0; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2025-06-10
DOI: 10.1016/j.neunet.2025.107686
URL: https://www.sciencedirect.com/science/article/pii/S0893608025005660
Multimodal learning aims to integrate information from different modalities to overcome the limitations of any single modality. Recent research has shown that multimodal learning methods often focus on optimizing a dominant modality, which leaves the model's performance underdeveloped and sometimes even inferior to that of single-modal models, a phenomenon referred to as the modality imbalance problem. To address this issue, some studies identify the dominant modality and adaptively adjust the gradients or loss functions accordingly. However, while these methods improve the convergence of the non-dominant modalities, they often reduce the model's ability to exploit information from the dominant modality. We therefore treat each modality in our model as a base classifier and address the modality imbalance problem from the perspective of ensemble learning, so that the information from each modality is fully used. In addition, we introduce negative correlation learning to ensure diversity in how the different modalities encode information. Experiments across multiple datasets, several late fusion techniques, and a variety of tasks show that the proposed method achieves significant improvements in accuracy over existing approaches.
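The abstract does not give the loss used in the paper. As a rough illustration only, the sketch below applies the classical negative correlation learning penalty (as commonly attributed to Liu and Yao) to per-modality outputs that are fused by simple averaging. The function names, the two example modalities, and the weight `lam` are assumptions made for this sketch and are not taken from the paper.

```python
from typing import List, Sequence


def ensemble_mean(outputs: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-modality predictions element-wise (simple late fusion)."""
    n = len(outputs)
    return [sum(column) / n for column in zip(*outputs)]


def ncl_penalty(outputs: Sequence[Sequence[float]], index: int) -> float:
    """Classical NCL penalty for member `index`, summed over output dimensions:
    (f_i - f_bar) * sum_{j != i} (f_j - f_bar).
    """
    mean = ensemble_mean(outputs)
    total = 0.0
    for k, mean_k in enumerate(mean):
        dev_i = outputs[index][k] - mean_k
        dev_others = sum(
            outputs[j][k] - mean_k for j in range(len(outputs)) if j != index
        )
        total += dev_i * dev_others
    return total


def ncl_loss(
    outputs: Sequence[Sequence[float]],
    target: Sequence[float],
    index: int,
    lam: float = 0.5,  # hypothetical trade-off weight, not from the paper
) -> float:
    """Half squared error of one member plus the weighted NCL penalty."""
    sq_err = 0.5 * sum((o - t) ** 2 for o, t in zip(outputs[index], target))
    return sq_err + lam * ncl_penalty(outputs, index)


# Hypothetical class scores from two modality-specific classifiers.
audio = [0.9, 0.1]
visual = [0.5, 0.5]
target = [1.0, 0.0]
fused = ensemble_mean([audio, visual])
audio_loss = ncl_loss([audio, visual], target, index=0)
```

Because the deviations from the ensemble mean sum to zero, the penalty for a member equals the negative of its squared deviation from the mean, so minimizing it pushes each modality's classifier away from the average and rewards disagreement between members.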
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.