Lydia A Schoenpflug, Ruben Bagan Benavides, Marta Nowak, Fahime Sheikhzadeh, Arash Moayyedi, Kamil Wasag, Jacob Reimers, Michael Zhou, Raghavan Venugopal, Bettina Sobottka, Yasmin Koeller, Michael Rivers, Holger Moch, Yao Nie, Viktor H Koelzer
Journal of Pathology Informatics, volume 18, article 100464. Published 2025-07-23 (eCollection, 2025-08-01).
DOI: https://doi.org/10.1016/j.jpi.2025.100464
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12357140/pdf/
Citations: 0
Navigating real-world challenges: A case study on federated learning in computational pathology.
Abstract
Federated learning (FL) allows institutions to collaboratively train deep learning models while maintaining data privacy, a critical aspect in fields like computational pathology (CPATH). However, existing studies focus on performance improvement in simulated environments and overlook practical aspects of FL. In this study, we address this need by transparently sharing the challenges encountered in the real-world application of FL for a clinical CPATH use case. We set up an FL framework consisting of three clients and a central server to jointly train deep learning models for digital immune phenotyping in metastatic melanoma, utilizing the NVIDIA Federated Learning Application Runtime Environment (NVIDIA FLARE) across four separate networks from institutes in four countries. Our findings reveal several key challenges. First, the FL model performs best across all clients' test sets but does not outperform every local model on that model's own client test set. Second, long experiment durations caused by system and data heterogeneity limited experiment frequency; this was alleviated by optimizing local client epochs. Third, infrastructure design was hindered by hospital and corporate network restrictions: the server required an open port, which we resolved by deploying the server on Amazon Web Services infrastructure within a semi-public network. Lastly, effective experiment management required IT expertise and strong familiarity with NVIDIA FLARE to enable orchestration, code management, parameter configuration, and logging. Our findings provide a practical perspective on implementing FL for CPATH, advocating for greater transparency in future research and the development of best practices and guidelines for implementing FL in real-world healthcare settings.
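To make the training loop described in the abstract concrete, the following is a minimal sketch of one common FL scheme, sample-weighted federated averaging, with three simulated clients and a configurable number of local epochs. It is an illustration only: the abstract does not state which aggregation algorithm the authors used, the code does not use the NVIDIA FLARE API, and every name here (`local_train`, `fedavg_round`, `train`, the toy data, the learning rate) is hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (feature x, target y); toy stand-in for real data


def local_train(w: float, data: List[Sample], local_epochs: int, lr: float) -> float:
    """Client side: per-sample gradient descent on y ~ w * x, starting from w."""
    for _ in range(local_epochs):
        for x, y in data:
            grad = 2.0 * (w * x - y) * x  # derivative of (w*x - y)^2 w.r.t. w
            w -= lr * grad
    return w


def fedavg_round(w_global: float, clients: List[List[Sample]],
                 local_epochs: int, lr: float) -> float:
    """Server side: send w_global to every client, then average the returned
    weights, weighting each client by its number of samples."""
    total = sum(len(d) for d in clients)
    return sum(len(d) / total * local_train(w_global, d, local_epochs, lr)
               for d in clients)


def train(clients: List[List[Sample]], rounds: int,
          local_epochs: int, lr: float) -> float:
    """Run several communication rounds, starting from w = 0."""
    w = 0.0
    for _ in range(rounds):
        w = fedavg_round(w, clients, local_epochs, lr)
    return w


# Hypothetical heterogeneous clients: each has a slightly different true slope
# and a different dataset size. Only model weights cross client boundaries.
xs = [0.5, 1.0, 1.5]
clients = [
    [(x, 1.8 * x) for x in xs],       # client A, 3 samples
    [(x, 2.0 * x) for x in xs] * 2,   # client B, 6 samples
    [(x, 2.2 * x) for x in xs],       # client C, 3 samples
]

w_final = train(clients, rounds=20, local_epochs=2, lr=0.05)
print(f"global weight after training: {w_final:.4f}")
```

In this toy setting, `local_epochs` controls how much work each client does between communication rounds, which is the kind of knob the abstract refers to when it mentions optimizing local client epochs. The raw data never leaves a client; only the scalar weight is exchanged.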
About the journal:
The Journal of Pathology Informatics (JPI) is an open access peer-reviewed journal dedicated to the advancement of pathology informatics. This is the official journal of the Association for Pathology Informatics (API). The journal aims to publish broadly about pathology informatics and freely disseminate all articles worldwide. This journal is of interest to pathologists, informaticians, academics, researchers, health IT specialists, information officers, IT staff, vendors, and anyone with an interest in informatics. We encourage submissions from anyone with an interest in the field of pathology informatics. We publish all types of papers related to pathology informatics including original research articles, technical notes, reviews, viewpoints, commentaries, editorials, symposia, meeting abstracts, book reviews, and correspondence to the editors. All submissions are subject to rigorous peer review by the well-regarded editorial board and by expert referees in appropriate specialties.