Tackling Modality-Heterogeneous Client Drift Holistically for Heterogeneous Multimodal Federated Learning

Haoyue Song; Jiacheng Wang; Jianjun Zhou; Liansheng Wang

IEEE Transactions on Medical Imaging, vol. 44, no. 4, pp. 1931-1941. Published 2024-12-26. DOI: 10.1109/TMI.2024.3523378
Multimodal Federated Learning (MFL) has emerged as a collaborative paradigm for training models across decentralized devices, harnessing various data modalities to facilitate effective learning while respecting data ownership. Notably, a pivotal shift from homogeneous to heterogeneous MFL has taken place in this field. While the former assumes uniform input modalities across clients, the latter accommodates modality-incongruous setups, which are common in practice. For example, while some advanced medical institutions can afford to use both MRI and CT for disease diagnosis, remote hospitals are often constrained to CT alone because of its cost-effectiveness. Although heterogeneous MFL applies to a broader range of scenarios, it introduces a new challenge: modality-heterogeneous client drift, which arises from diverse modality-coupled local optimization. To address this, we introduce FedMM, a simple yet effective approach. During local optimization, FedMM employs modality dropout, randomly masking available modalities to promote weight alignment while preserving each model's expressivity on its original modality combination. To enhance the modality dropout process, FedMM incorporates a task-specific inter- and intra-modal regularizer, which acts as an additional constraint that keeps the weight distribution consistent across diverse input modalities and thereby eases optimization with modality dropout enabled. Combining the two, our approach addresses client drift holistically: it fosters convergence among client models while respecting each client's unique input modalities, improving heterogeneous MFL performance. Comprehensive evaluations on three medical image segmentation datasets demonstrate FedMM's superiority over state-of-the-art heterogeneous MFL methods.
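To make the modality-dropout idea concrete, the sketch below shows one way random masking of a client's available modalities could look. This is an illustrative toy, not the paper's implementation: the modality names, the keep probability, and the `modality_dropout` helper are all assumed for the example; the guarantee that at least one modality survives is likewise an assumption made here to keep the forward pass well-defined.

```python
import random

def modality_dropout(modalities, keep_prob=0.5, rng=None):
    """Randomly mask a client's available modalities.

    `modalities` maps modality names (e.g. "MRI", "CT") to their input
    tensors; masked modalities are replaced with None so the local model
    trains on the remaining inputs. Always keeps at least one modality
    so the forward pass has an input (an assumption of this sketch).
    """
    rng = rng or random.Random()
    names = list(modalities)
    kept = {n for n in names if rng.random() < keep_prob}
    if not kept:  # never drop every modality
        kept = {rng.choice(names)}
    return {n: (v if n in kept else None) for n, v in modalities.items()}

# Example: a client that holds both MRI and CT inputs.
inputs = {"MRI": "mri_tensor", "CT": "ct_tensor"}
masked = modality_dropout(inputs, keep_prob=0.5, rng=random.Random(0))
```

Each local step would then run on `masked` instead of `inputs`, so a two-modality client periodically optimizes the same single-modality pathways that CT-only clients use, which is what pushes the client weights toward alignment.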