Federated Learning for Healthcare: Systematic Review and Architecture Proposal

ACM Transactions on Intelligent Systems and Technology (TIST). Pub Date: 2022-02-04. DOI: 10.1145/3501813

Rodolfo Stoffel Antunes, Cristiano André da Costa, A. Küderle, Imrana Abdullahi Yari, B. Eskofier


Citations: 110

Abstract



The use of machine learning (ML) with electronic health records (EHR) is growing in popularity as a means to extract knowledge that can improve the decision-making process in healthcare. Such methods require training of high-quality learning models based on diverse and comprehensive datasets, which are hard to obtain due to the sensitive nature of medical data from patients. In this context, federated learning (FL) is a methodology that enables the distributed training of machine learning models with remotely hosted datasets without the need to accumulate data and, therefore, compromise it. FL is a promising solution to improve ML-based systems, better aligning them to regulatory requirements, improving trustworthiness and data sovereignty. However, many open questions must be addressed before the use of FL becomes widespread. This article aims at presenting a systematic literature review on current research about FL in the context of EHR data for healthcare applications. Our analysis highlights the main research topics, proposed solutions, case studies, and respective ML methods. Furthermore, the article discusses a general architecture for FL applied to healthcare data based on the main insights obtained from the literature review. The collected literature corpus indicates that there is extensive research on the privacy and confidentiality aspects of training data and model sharing, which is expected given the sensitive nature of medical data. Studies also explore improvements to the aggregation mechanisms required to generate the learning model from distributed contributions and case studies with different types of medical data.
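To make the idea of training on remotely hosted datasets and aggregating "distributed contributions" concrete, the following is a minimal sketch, not the architecture proposed in the article. The abstract does not name an aggregation rule, so the sketch assumes federated averaging (FedAvg), a common baseline in which a coordinator averages locally trained parameters weighted by each site's sample count. The linear model, the function names (`local_update`, `federated_average`, `train`), and the three toy "hospital" datasets are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Weights = List[float]                 # [slope, intercept] of a 1-D linear model
Dataset = List[Tuple[float, float]]   # (x, y) pairs that never leave their site


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Train on one site's private data, starting from the global weights."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error for y_hat = w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(updates: List[Weights], sizes: List[int]) -> Weights:
    """Aggregate site updates, weighting each by its number of samples."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * n for u, n in zip(updates, sizes)) / total
            for i in range(dim)]


def train(sites: List[Dataset], rounds: int = 50) -> Weights:
    """Run several rounds; only model parameters are exchanged, never data."""
    global_weights: Weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates, [len(d) for d in sites])
    return global_weights


# Hypothetical toy data: three sites, each holding samples of y = 2x + 1.
sites = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (0.5, 2.0), (1.5, 4.0)],
    [(-1.0, -1.0), (3.0, 7.0)],
]
w, b = train(sites)
```

In this sketch the coordinator sees only parameter vectors and sample counts. Real healthcare deployments would typically add the privacy and confidentiality mechanisms for model sharing that the review identifies as a major research topic; those are deliberately left out here.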

