Non-IID Medical Image Segmentation Based on Cascaded Diffusion Model for Diverse Multi-Center Scenarios
Authors: Hanwen Zhang, Mingzhi Chen, Yuxi Liu, Guibo Luo, Yuesheng Zhu
Journal: IEEE Journal of Biomedical and Health Informatics (JCR Q1, Computer Science, Information Systems; impact factor 6.7)
DOI: 10.1109/JBHI.2025.3549029 (https://doi.org/10.1109/JBHI.2025.3549029)
Publication date: 2025-03-27
Publication type: Journal Article
Citations: 0
Abstract
Learning a high-performance global model from multi-center medical datasets is challenging because of privacy protection requirements and data heterogeneity in healthcare systems. Current federated learning approaches do not learn efficiently from Non-Independent and Identically Distributed (Non-IID) data and incur high communication costs. This work proposes a practical privacy computing framework that trains a Non-IID medical image segmentation model under various multi-center settings at low communication cost. Specifically, an efficient cascaded diffusion model is trained to generate image-mask pairs whose distribution is similar to that of the clients' training data, providing rich labeled data on the client side to mitigate heterogeneity. A label construction module is also developed to improve the quality of the generated image-mask pairs. Moreover, a set of aggregation methods is proposed to obtain a global model from the data generated by the cascaded diffusion model in diverse scenarios: CD-Syn, CD-Ens, and its extension CD-KD. CD-Syn is a one-shot method that trains the segmentation model solely on public generated datasets, while CD-Ens and CD-KD maximize the use of local original data through an extra communication round of ensembling or knowledge distillation. The proposed framework is therefore highly practical, offering multiple aggregation methods that adapt flexibly to varying demands for efficiency, privacy, and accuracy. We systematically evaluated the framework on five Non-IID medical datasets and observed an average improvement of 5.38% in Dice score over the baseline method (FednnU-Net).
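The abstract reports results as Dice scores and describes an ensemble round (CD-Ens) without giving details. As a rough illustration only, the sketch below computes the standard Dice coefficient on flattened binary masks and combines per-pixel probabilities from several client models by simple averaging. The averaging rule, the 0.5 threshold, the function names, and the toy data are all assumptions made for this example; they are not taken from the paper and may differ from the authors' CD-Ens procedure.

```python
from typing import List, Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient for two flattened binary masks (values 0 or 1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention (an assumption): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def ensemble_mask(client_probs: List[Sequence[float]], threshold: float = 0.5) -> List[int]:
    """Average per-pixel foreground probabilities across client models, then threshold.

    Hypothetical aggregation rule used for illustration, not the paper's method.
    """
    n_clients = len(client_probs)
    averaged = [sum(pixel) / n_clients for pixel in zip(*client_probs)]
    return [1 if p >= threshold else 0 for p in averaged]


# Toy data: three hypothetical client models scoring a 4-pixel image.
client_probs = [
    [0.9, 0.2, 0.6, 0.1],
    [0.8, 0.4, 0.3, 0.2],
    [0.7, 0.1, 0.9, 0.0],
]
target = [1, 0, 1, 1]

mask = ensemble_mask(client_probs)
score = dice_score(mask, target)
print(mask, round(score, 3))
```

On this toy input the ensemble misses the last foreground pixel, so the Dice score falls below 1; a real evaluation would average the metric over whole datasets and classes.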
About the journal:
IEEE Journal of Biomedical and Health Informatics publishes original papers presenting recent advances where information and communication technologies intersect with health, healthcare, life sciences, and biomedicine. Topics include acquisition, transmission, storage, retrieval, management, and analysis of biomedical and health information. The journal covers applications of information technologies in healthcare, patient monitoring, preventive care, early disease diagnosis, therapy discovery, and personalized treatment protocols. It explores electronic medical and health records, clinical information systems, decision support systems, medical and biological imaging informatics, wearable systems, body area/sensor networks, and more. Integration-related topics like interoperability, evidence-based medicine, and secure patient data are also addressed.