Advancing Privacy-Preserving Health Care Analytics and Implementation of the Personal Health Train: Federated Deep Learning Study.

JMIR AI. Pub Date: 2025-02-06. DOI: 10.2196/60847

Ananya Choudhury, Leroy Volmer, Frank Martin, Rianne Fijten, Leonard Wee, Andre Dekker, Johan van Soest

JMIR AI, volume 4, article e60847. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11843053/pdf/


Abstract

Background: The rapid advancement of deep learning in health care presents significant opportunities for automating complex medical tasks and improving clinical workflows. However, widespread adoption is impeded by data privacy concerns and the necessity for large, diverse datasets across multiple institutions. Federated learning (FL) has emerged as a viable solution, enabling collaborative artificial intelligence model development without sharing individual patient data. To effectively implement FL in health care, robust and secure infrastructures are essential. Developing such federated deep learning frameworks is crucial to harnessing the full potential of artificial intelligence while ensuring patient data privacy and regulatory compliance.

Objective: The objective is to introduce an innovative FL infrastructure called the Personal Health Train (PHT) that includes the procedural, technical, and governance components needed to implement FL on real-world health care data, including training deep learning neural networks. The study aims to apply this federated deep learning infrastructure to the use case of gross tumor volume segmentation on chest computed tomography images of patients with lung cancer and present the results from a proof-of-concept experiment.

Methods: The PHT framework addresses the data privacy challenges of data sharing by keeping the data at the source and bringing the analysis to the data. Technologically, PHT requires 3 interdependent components: "tracks" (protected communication channels), "trains" (containerized software apps), and "stations" (institutional data repositories), which are supported by the open source "Vantage6" software. The study applies this federated deep learning infrastructure to the use case of gross tumor volume segmentation on chest computed tomography images of patients with lung cancer, and introduces an additional component, the secure aggregation server, where model averaging is done in a trusted and inaccessible environment.
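The model-averaging step performed at the secure aggregation server can be illustrated with a minimal sketch. The abstract states only that model averaging takes place there; the weighting by local sample count (as in the common FedAvg scheme), the function name `federated_average`, and the dictionary-of-lists weight layout are assumptions made for illustration, not the study's implementation or the Vantage6 API.

```python
from typing import Dict, List, Sequence

# Hypothetical weight layout: layer name -> flattened parameter values.
Weights = Dict[str, List[float]]


def federated_average(updates: Sequence[Weights],
                      sample_counts: Sequence[int]) -> Weights:
    """Average per-station model weights, weighted by local sample count.

    Assumption: sample-count weighting (FedAvg-style). The source only
    says that "model averaging" is done at the aggregation server.
    """
    if not updates or len(updates) != len(sample_counts):
        raise ValueError("need exactly one sample count per station update")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name, reference in updates[0].items():
        layer = [0.0] * len(reference)
        for weights, count in zip(updates, sample_counts):
            if len(weights[name]) != len(reference):
                raise ValueError(f"shape mismatch in layer {name!r}")
            # Each station contributes in proportion to its data size.
            for i, value in enumerate(weights[name]):
                layer[i] += value * count / total
        averaged[name] = layer
    return averaged


# Hypothetical updates from two stations; only weights leave each site,
# never the underlying images.
station_a = {"conv1": [0.0, 2.0]}
station_b = {"conv1": [4.0, 6.0]}
global_weights = federated_average([station_a, station_b], [1, 3])
```

In this sketch each station sends only its locally trained parameters and a sample count, which mirrors the principle that the analysis travels to the data rather than the reverse.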

Results: We demonstrated the feasibility of executing deep learning algorithms in a federated manner using PHT and presented the results from a proof-of-concept study. The infrastructure linked 12 hospitals across 8 nations, covering 4 continents, demonstrating the scalability and global reach of the proposed approach. During the execution and training of the deep learning algorithm, no data were shared outside the hospitals.

Conclusions: The findings of the proof-of-concept study, as well as the implications and limitations of the infrastructure and the results, are discussed. The application of federated deep learning to unstructured medical imaging data, facilitated by the PHT framework and Vantage6 platform, represents a significant advancement in the field. The proposed infrastructure addresses the challenges of data privacy and enables collaborative model development, paving the way for the widespread adoption of deep learning-based tools in the medical domain and beyond. The introduction of the secure aggregation server suggests that data leakage problems in FL can be prevented by careful design of the infrastructure.

Trial registration: ClinicalTrials.gov NCT05775068; https://clinicaltrials.gov/study/NCT05775068.
