严重模态缺失下基于跨模态原型的多模态联邦学习

IF 14.7 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Information Fusion Pub Date : 2025-04-24 DOI:10.1016/j.inffus.2025.103219

Huy Q. Le , Chu Myaet Thwal , Yu Qiao , Ye Lin Tun , Minh N.H. Nguyen , Eui-Nam Huh , Choong Seon Hong

{"title":"严重模态缺失下基于跨模态原型的多模态联邦学习","authors":"Huy Q. Le , Chu Myaet Thwal , Yu Qiao , Ye Lin Tun , Minh N.H. Nguyen , Eui-Nam Huh , Choong Seon Hong","doi":"10.1016/j.inffus.2025.103219","DOIUrl":null,"url":null,"abstract":"<div><div>Multimodal federated learning (MFL) has emerged as a decentralized machine learning paradigm, allowing multiple clients with different modalities to collaborate on training a global model across diverse data sources without sharing their private data. However, challenges, such as data heterogeneity and severely missing modalities, pose crucial hindrances to the robustness of MFL, significantly impacting the performance of global model. The occurrence of missing modalities in real-world applications, such as autonomous driving, often arises from factors like sensor failures, leading knowledge gaps during the training process. Specifically, the absence of a modality introduces misalignment during the local training phase, stemming from zero-filling in the case of clients with missing modalities. Consequently, achieving robust generalization in global model becomes imperative, especially when dealing with clients that have incomplete data. In this paper, we propose <strong>Multimodal Federated Cross Prototype Learning (MFCPL</strong>), a novel approach for MFL under severely missing modalities. Our MFCPL leverages the complete prototypes to provide diverse modality knowledge in modality-shared level with the cross-modal regularization and modality-specific level with cross-modal contrastive mechanism. Additionally, our approach introduces the cross-modal alignment to provide regularization for modality-specific features, thereby enhancing the overall performance, particularly in scenarios involving severely missing modalities. Through extensive experiments on four multimodal datasets, we demonstrate the effectiveness of MFCPL in mitigating the challenges of data heterogeneity and severely missing modalities while improving the overall performance and robustness of MFL.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103219"},"PeriodicalIF":14.7000,"publicationDate":"2025-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Cross-modal prototype based multimodal federated learning under severely missing modality\",\"authors\":\"Huy Q. Le , Chu Myaet Thwal , Yu Qiao , Ye Lin Tun , Minh N.H. Nguyen , Eui-Nam Huh , Choong Seon Hong\",\"doi\":\"10.1016/j.inffus.2025.103219\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Multimodal federated learning (MFL) has emerged as a decentralized machine learning paradigm, allowing multiple clients with different modalities to collaborate on training a global model across diverse data sources without sharing their private data. However, challenges, such as data heterogeneity and severely missing modalities, pose crucial hindrances to the robustness of MFL, significantly impacting the performance of global model. The occurrence of missing modalities in real-world applications, such as autonomous driving, often arises from factors like sensor failures, leading knowledge gaps during the training process. Specifically, the absence of a modality introduces misalignment during the local training phase, stemming from zero-filling in the case of clients with missing modalities. Consequently, achieving robust generalization in global model becomes imperative, especially when dealing with clients that have incomplete data. In this paper, we propose <strong>Multimodal Federated Cross Prototype Learning (MFCPL</strong>), a novel approach for MFL under severely missing modalities. Our MFCPL leverages the complete prototypes to provide diverse modality knowledge in modality-shared level with the cross-modal regularization and modality-specific level with cross-modal contrastive mechanism. Additionally, our approach introduces the cross-modal alignment to provide regularization for modality-specific features, thereby enhancing the overall performance, particularly in scenarios involving severely missing modalities. Through extensive experiments on four multimodal datasets, we demonstrate the effectiveness of MFCPL in mitigating the challenges of data heterogeneity and severely missing modalities while improving the overall performance and robustness of MFL.</div></div>\",\"PeriodicalId\":50367,\"journal\":{\"name\":\"Information Fusion\",\"volume\":\"122 \",\"pages\":\"Article 103219\"},\"PeriodicalIF\":14.7000,\"publicationDate\":\"2025-04-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Fusion\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1566253525002921\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1566253525002921","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

多模态联邦学习（MFL）已经成为一种分散的机器学习范式，允许具有不同模式的多个客户端跨不同数据源协作训练全局模型，而无需共享其私有数据。然而，数据异质性和模态严重缺失等问题严重影响了模型的鲁棒性，严重影响了全局模型的性能。在实际应用中，如自动驾驶中，模式缺失的发生通常是由传感器故障、训练过程中的主要知识差距等因素引起的。具体来说，模态的缺失在本地培训阶段引入了错位，这是由于缺少模态的客户的零填充造成的。因此，实现全局模型的鲁棒泛化变得势在必行，特别是在处理具有不完整数据的客户端时。在本文中，我们提出了多模态联邦交叉原型学习（MFCPL），这是一种针对严重缺失模态下的多模态联邦交叉原型学习的新方法。我们的MFCPL利用完整的原型，通过跨模态正则化和跨模态对比机制，在模态共享层面提供多种模态知识。此外，我们的方法引入了跨模态对齐，为模态特定的特征提供正则化，从而提高了整体性能，特别是在涉及严重缺失模态的场景中。通过在四个多模态数据集上的大量实验，我们证明了MFCPL在减轻数据异质性和严重缺失模态挑战的同时提高了MFL的整体性能和鲁棒性的有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Cross-modal prototype based multimodal federated learning under severely missing modality

Multimodal federated learning (MFL) has emerged as a decentralized machine learning paradigm, allowing multiple clients with different modalities to collaborate on training a global model across diverse data sources without sharing their private data. However, challenges, such as data heterogeneity and severely missing modalities, pose crucial hindrances to the robustness of MFL, significantly impacting the performance of global model. The occurrence of missing modalities in real-world applications, such as autonomous driving, often arises from factors like sensor failures, leading knowledge gaps during the training process. Specifically, the absence of a modality introduces misalignment during the local training phase, stemming from zero-filling in the case of clients with missing modalities. Consequently, achieving robust generalization in global model becomes imperative, especially when dealing with clients that have incomplete data. In this paper, we propose Multimodal Federated Cross Prototype Learning (MFCPL), a novel approach for MFL under severely missing modalities. Our MFCPL leverages the complete prototypes to provide diverse modality knowledge in modality-shared level with the cross-modal regularization and modality-specific level with cross-modal contrastive mechanism. Additionally, our approach introduces the cross-modal alignment to provide regularization for modality-specific features, thereby enhancing the overall performance, particularly in scenarios involving severely missing modalities. Through extensive experiments on four multimodal datasets, we demonstrate the effectiveness of MFCPL in mitigating the challenges of data heterogeneity and severely missing modalities while improving the overall performance and robustness of MFL.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Information Fusion 工程技术-计算机：理论方法

CiteScore

33.20

自引率

4.30%

发文量

161

审稿时长

7.9 months

期刊介绍： Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.