Title: Federated modality-specific encoders and partially personalized fusion decoder for multimodal brain tumor segmentation
Authors: Hong Liu, Dong Wei, Qian Dai, Xian Wu, Yefeng Zheng, Liansheng Wang
Journal: Medical Image Analysis, Volume 107, Article 103759
DOI: 10.1016/j.media.2025.103759
Publication date: 2025-08-18
URL: https://www.sciencedirect.com/science/article/pii/S1361841525003007
Citations: 0
Abstract
Most existing federated learning (FL) methods for medical image analysis have considered only intramodal heterogeneity, limiting their applicability to multimodal imaging applications. In practice, some FL participants may possess only a subset of the complete imaging modalities, posing intermodal heterogeneity as a challenge to effectively training a global model on all participants' data. Meanwhile, in FL, each participant expects a personalized model tailored to its local data characteristics. This work proposes a new FL framework with federated modality-specific encoders and partially personalized multimodal fusion decoders (FedMEPD) to address these two concurrent issues. Specifically, FedMEPD employs an exclusive encoder for each modality to account for the intermodal heterogeneity. While these encoders are fully federated, the decoders are partially personalized to meet individual needs, using the discrepancy between global and local parameter updates to dynamically determine which decoder filters are personalized. Implementation-wise, a server with full-modal data employs a fusion decoder to fuse representations from all modality-specific encoders, thus bridging the modalities to optimize the encoders via backpropagation. Moreover, multiple anchors are extracted from the fused multimodal representations and distributed to the clients in addition to the model parameters. Conversely, the clients with incomplete modalities calibrate their missing-modal representations toward the global full-modal anchors via scaled dot-product cross-attention, making up for the information loss due to absent modalities. FedMEPD is validated on the BraTS 2018 and 2020 multimodal brain tumor segmentation benchmarks. Results show that it outperforms various up-to-date methods for multimodal and personalized FL, and that its novel designs are effective.
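The abstract's anchor-based calibration step can be made concrete with a short sketch. The code below is not the authors' implementation; it merely illustrates, under assumed names and shapes, how a client with missing modalities could use scaled dot-product cross-attention, with its own fused features as queries and the server-distributed full-modal anchors as keys and values, to compensate for absent modalities.

```python
# A minimal, illustrative sketch of anchor-based calibration via
# scaled dot-product cross-attention. Function and variable names,
# tensor shapes, and the residual combination are assumptions, not
# the paper's exact design.
import math
import torch
import torch.nn.functional as F


def calibrate_with_anchors(local_feats: torch.Tensor,
                           global_anchors: torch.Tensor) -> torch.Tensor:
    """Pull missing-modal representations toward global full-modal anchors.

    local_feats:    (N, d) fused features from the client's available modalities.
    global_anchors: (K, d) anchors extracted from the server's full-modal fusion.
    Returns:        (N, d) calibrated representations.
    """
    d = local_feats.size(-1)
    # Queries = local features, keys/values = global anchors.
    scores = local_feats @ global_anchors.T / math.sqrt(d)   # (N, K)
    weights = F.softmax(scores, dim=-1)                       # (N, K)
    attended = weights @ global_anchors                        # (N, d)
    # Residual combination keeps the client's own information while
    # injecting full-modal context carried by the anchors.
    return local_feats + attended


if __name__ == "__main__":
    feats = torch.randn(16, 256)    # stand-in client features (16 tokens, dim 256)
    anchors = torch.randn(8, 256)   # stand-in for 8 global anchors
    print(calibrate_with_anchors(feats, anchors).shape)  # torch.Size([16, 256])
```

In this sketch the attention output is added back to the client's own features, so the calibration enriches rather than replaces the locally available information; whether the actual method uses such a residual connection is an assumption here.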
Journal introduction:
Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.