Authors: Negin Piran Nanekaran, Eranga Ukwatta
Journal: Medical Physics, volume 52, issue 8
Published: 2025-09-03
DOI: 10.1002/mp.18064 (https://aapm.onlinelibrary.wiley.com/doi/10.1002/mp.18064)
A novel federated learning framework for medical imaging: Resource-efficient approach combining PCA with early stopping
Background
Federated learning (FL) facilitates collaborative model training across multiple institutions while preserving privacy by avoiding the sharing of raw data, a critical consideration in medical imaging applications. Despite its potential, FL faces challenges such as high-dimensional data, heterogeneity among datasets from different centers, and resource constraints, which limit its efficiency and effectiveness in healthcare settings.
Purpose
This study aims to present a novel adaptive FL framework to address the challenges of data heterogeneity and resource constraints in medical imaging. The proposed framework is designed to optimize computational efficiency, enhance training processes, improve model performance, and ensure robustness against non-independent and identically distributed (non-IID) data across decentralized data sources.
Methods
The proposed adaptive FL framework addresses high-dimensional data and heterogeneity across nonuniform, decentralized data sources. Its key innovation is federated incremental principal component analysis (FIPCA), which achieves privacy-preserving dimensionality reduction by aggregating local scatter matrices and means from participating centers to compute a global PCA model. This process aligns data across centers, mitigates heterogeneity, and significantly reduces computational complexity. FIPCA is combined with an adaptive early-stopping mechanism applied to the global training rounds. We evaluated the framework's ability to generalize across institutions in a cross-site classification task distinguishing clinically significant prostate cancer (csPCa) from non-csPCa. This assessment used 1500 T2-weighted (T2W) prostate MRI images from three institutions: two centers (800 + 350 cases) were used for training and validation, and one center (350 cases) served as an independent test site.
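The aggregation step described above can be illustrated with a minimal sketch. It is not the authors' FIPCA implementation: it performs a single one-shot aggregation rather than an incremental update, and the function names (`local_summary`, `aggregate_pca`) and the synthetic data are assumptions made for illustration. Each center shares only its sample count, mean vector, and scatter matrix; the server combines them into the scatter matrix of the pooled data and takes its leading eigenvectors.

```python
import numpy as np

def local_summary(X):
    """Run at each center: returns (count, mean, scatter). Raw rows never leave the site."""
    mean = X.mean(axis=0)
    centered = X - mean
    return X.shape[0], mean, centered.T @ centered

def aggregate_pca(summaries, n_components):
    """Run at the server: rebuild the pooled covariance from per-center summaries."""
    total = sum(n for n, _, _ in summaries)
    global_mean = sum(n * m for n, m, _ in summaries) / total
    # Pooled scatter = within-center scatter + between-center scatter.
    scatter = sum(
        S + n * np.outer(m - global_mean, m - global_mean)
        for n, m, S in summaries
    )
    eigvals, eigvecs = np.linalg.eigh(scatter / (total - 1))
    order = np.argsort(eigvals)[::-1][:n_components]
    return global_mean, eigvecs[:, order], eigvals[order]

# Hypothetical data: two centers with different feature distributions.
rng = np.random.default_rng(0)
center_a = rng.normal(0.0, 1.0, size=(80, 6))
center_b = rng.normal(0.5, 2.0, size=(35, 6))

global_mean, components, variances = aggregate_pca(
    [local_summary(center_a), local_summary(center_b)], n_components=3
)

# Every center projects its own data with the same global model.
projected_a = (center_a - global_mean) @ components

# Reference computed on pooled data, only to check the sketch.
pooled = np.vstack([center_a, center_b])
pooled_cov = np.cov(pooled, rowvar=False)
```

Because the between-center term corrects for differing local means, the result matches PCA on the pooled data, which is why all centers end up in a shared low-dimensional space.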
Results
The proposed method significantly reduced the number of global training rounds from 200 to 38, achieving a 98% reduction in energy consumption compared to the standard FedAvg algorithm. The effective use of FIPCA for dimensionality reduction enhanced generalizability, while adaptive early stopping prevented overfitting. Together these improved model performance: the area under the curve (AUC) on the unseen test center increased from 0.68 to 0.73 (95% CI 0.70–0.77). The method also showed improved sensitivity and specificity, indicating superior classification performance. By reducing data dimensionality, FIPCA accelerated convergence, and the adaptive early-stopping mechanism further optimized resource utilization.
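To make the round-count figures concrete, the sketch below pairs FedAvg-style weighted averaging with a patience-based stopping rule. The abstract does not specify the adaptive criterion, so the patience rule, the names `fedavg` and `early_stop_round`, the thresholds, and the synthetic loss curve are all assumptions for illustration, not the authors' method.

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Average client parameter vectors, weighted by local sample count."""
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    return sum(w * p for w, p in zip(weights, client_params))

def early_stop_round(val_losses, patience=5, min_delta=1e-3):
    """Return (stopping round, best loss) for a sequence of per-round validation losses.

    Stops once `patience` consecutive rounds fail to improve the best loss
    by more than `min_delta`. This simple rule stands in for the paper's
    adaptive criterion, which the abstract does not describe.
    """
    best, stale = float("inf"), 0
    for round_idx, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best, stale = loss, 0
        else:
            stale += 1
        if stale >= patience:
            return round_idx, best
    return len(val_losses), best

# Weighted average for two clients sized like the two training centers.
averaged = fedavg([np.array([1.0, 0.0]), np.array([0.0, 1.0])], [800, 350])

# Synthetic validation curve: improves for 10 rounds, then plateaus.
losses = [max(0.3, 1.0 - 0.07 * r) for r in range(1, 201)]
stopped_at, best_loss = early_stop_round(losses)

# Fraction of global rounds avoided, using the abstract's 200 -> 38 figures.
rounds_saved = 1 - 38 / 200
```

With the abstract's numbers, stopping at round 38 of 200 avoids 81% of the global rounds. The reported 98% energy reduction is larger than that, and the abstract does not break down where the remaining savings come from.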
Conclusions
Our adaptive FL approach efficiently handles large, heterogeneous medical imaging data, reducing training time and computational overhead, while improving model accuracy. The substantial reduction in energy consumption and accelerated convergence make it suitable for real-world healthcare settings.
About the journal:
Medical Physics publishes original, high-impact physics, imaging science, and engineering research that advances patient diagnosis and therapy through contributions in: 1) basic science developments with high potential for clinical translation; 2) clinical applications of cutting-edge engineering and physics innovations; and 3) broadly applicable and innovative clinical physics developments.
Medical Physics is a journal of global scope and reach. By publishing in Medical Physics, your research will reach an international, multidisciplinary audience including practicing medical physicists as well as physics- and engineering-based translational scientists. We work closely with authors of promising articles to improve their quality.