Federated Learning for Renal Tumor Segmentation and Classification on Multi-Center MRI Dataset.

IF 3.5; CAS Zone 2 (Medicine); JCR Q1: RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING

Journal of Magnetic Resonance Imaging. Pub Date: 2025-09-01. Epub Date: 2025-05-19. DOI: 10.1002/jmri.29819

Dat-Thanh Nguyen, Maliha Imami, Lin-Mei Zhao, Jing Wu, Ali Borhani, Alireza Mohseni, Mihir Khunte, Zhusi Zhong, Victoria Shi, Sophie Yao, Yuli Wang, Nicolas Loizou, Alvin C Silva, Paul J Zhang, Zishu Zhang, Zhicheng Jiao, Ihab Kamel, Wei-Hua Liao, Harrison Bai

Pages: 814-824. Publication type: Journal Article.

Citations: 0

Abstract

Background: Deep learning (DL) models for accurate renal tumor characterization may benefit from multi-center datasets for improved generalizability; however, data-sharing constraints necessitate privacy-preserving solutions like federated learning (FL).

Purpose: To assess the performance and reliability of FL for renal tumor segmentation and classification in multi-institutional MRI datasets.

Study type: Retrospective multi-center study.

Population: A total of 987 patients (403 female) from six hospitals were included for analysis. 73% (723/987) had malignant renal tumors, primarily clear cell carcinoma (n = 509). Patients were split into training (n = 785), validation (n = 104), and test (n = 99) sets, stratified across three simulated institutions.

Field strength/sequence: MRI was performed at 1.5 T and 3 T using T2-weighted imaging (T2WI) and contrast-enhanced T1-weighted imaging (CE-T1WI) sequences.

Assessment: Both the FL and non-FL approaches used nnU-Net for tumor segmentation and ResNet for tumor classification. FL models were trained across three simulated institutional clients with central weight aggregation, while the non-FL approach used centralized training on the full dataset.
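The central weight aggregation step can be illustrated with a minimal sketch. The abstract does not state which aggregation rule was used, so the code below assumes sample-size-weighted averaging in the style of FedAvg. The function name `aggregate_weights`, the toy parameter name `conv.weight`, and the client sizes are all hypothetical and are not taken from the study.

```python
from typing import Dict, List

# Parameter name -> flattened parameter values (a toy stand-in for model weights).
Weights = Dict[str, List[float]]


def aggregate_weights(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its sample count.

    Assumption: FedAvg-style weighting; the study's actual rule is not given.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    aggregated: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        aggregated[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return aggregated


# Three hypothetical clients, each holding a toy two-parameter "model".
clients = [
    {"conv.weight": [1.0, 2.0]},
    {"conv.weight": [3.0, 4.0]},
    {"conv.weight": [5.0, 6.0]},
]
sizes = [100, 100, 200]  # made-up local training-set sizes
global_weights = aggregate_weights(clients, sizes)
print(global_weights)
```

In a full training loop, the server would send `global_weights` back to each client, the clients would train locally for some number of steps, and the aggregation would repeat each round; only weights, not images, leave each institution.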

Statistical tests: Segmentation was evaluated using the Dice coefficient, and classification of malignant versus benign lesions was assessed using accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). FL and non-FL performance was compared using the Wilcoxon test for segmentation Dice and DeLong's test for AUC, with p < 0.05 considered significant.
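For readers less familiar with these metrics, the following is a small sketch of how the Dice coefficient and the threshold-based classification metrics are conventionally computed. It is not the authors' evaluation code. The function names, the toy masks and labels, and the convention of returning a Dice of 1.0 when both masks are empty are assumptions made for illustration.

```python
from typing import Dict, Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    denom = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if denom == 0 else 2.0 * intersection / denom


def classification_metrics(pred: Sequence[int], truth: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity with malignant coded as 1."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("need equally sized, non-empty label lists")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    return {
        "accuracy": (tp + tn) / len(truth),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy data, not from the study.
pred_mask = [0, 1, 1, 1, 0, 0]
true_mask = [0, 0, 1, 1, 1, 0]
dice = dice_coefficient(pred_mask, true_mask)

true_labels = [1, 1, 1, 0, 0]
pred_labels = [1, 1, 0, 0, 1]
metrics = classification_metrics(pred_labels, true_labels)
print(round(dice, 3), metrics)
```

Per-case Dice values computed this way for the FL and non-FL models on the same test cases form the paired samples that a Wilcoxon signed-rank test would compare.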

Results: No significant difference was observed between FL and non-FL models in segmentation (Dice: 0.43 vs. 0.45, p = 0.202) or classification (AUC: 0.69 vs. 0.64, p = 0.959) on the test set. For classification, no significant difference was observed between the models in accuracy (p = 0.912), sensitivity (p = 0.862), or specificity (p = 0.847) on the test set.

Data conclusion: FL demonstrated comparable performance to non-FL approaches in renal tumor segmentation and classification, supporting its potential as a privacy-preserving alternative for multi-institutional DL models.

Evidence level: 4.

Technical efficacy: Stage 2.




Source journal

Journal of Magnetic Resonance Imaging (category: Medicine, Nuclear Medicine)

CiteScore: 9.70

Self-citation rate: 6.80%

Articles published: 494

Review time: 2 months

Journal description: The Journal of Magnetic Resonance Imaging (JMRI) is an international journal devoted to the timely publication of basic and clinical research, educational and review articles, and other information related to the diagnostic applications of magnetic resonance.