Review learning: Real world validation of privacy preserving continual learning across medical institutions

IF 6.3 | Zone 2 (Medicine) | JCR Q1 (Biology)

Computers in biology and medicine Pub Date : 2025-05-08 DOI:10.1016/j.compbiomed.2025.110239

Jaesung Yoo, Sunghyuk Choi, Ye Seul Yang, Suhyeon Kim, Jieun Choi, Dongkyeong Lim, Yaeji Lim, Hyung Joon Joo, Dae Jung Kim, Rae Woong Park, Hyung-Jin Yoon, Kwangsoo Kim

Volume 192, Article 110239. https://www.sciencedirect.com/science/article/pii/S0010482525005906

Citations: 0

Abstract

When a deep learning model is trained sequentially on different datasets, it often forgets the knowledge learned from previous data, a problem known as catastrophic forgetting. This degrades the model’s performance across diverse datasets, which is critical in privacy-preserving deep learning (PPDL) applications based on transfer learning (TL). To overcome this, we introduce “review learning” (RevL), a low-cost continual learning algorithm for diagnosis prediction using electronic health records (EHR) within a PPDL framework. RevL generates data samples from the model, which are used to review knowledge from previous datasets. Six simulated institutional experiments and one real-world experiment involving three medical institutions were conducted to validate RevL, using three binary-classification EHR datasets. In the real-world experiment with data from 106,508 patients, the mean global area under the receiver operating characteristic curve was 0.710 for RevL and 0.655 for TL. These results demonstrate RevL’s ability to retain previously learned knowledge and its effectiveness in real-world PPDL scenarios. Our work establishes a realistic pipeline for PPDL research based on model transfers across institutions and highlights the practicality of continual learning in real-world medical settings using private EHR data.
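To make the reported metric concrete, the following is a minimal sketch of how an area under the ROC curve can be computed per institution and then averaged. It is not the authors' evaluation code. The abstract does not define how the "mean global" figure is aggregated, so the unweighted average used here is an assumption, and the institution names, labels, and scores are hypothetical toy values.

```python
from statistics import mean


def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic: the fraction of
    (positive, negative) pairs ranked correctly, with ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auroc(per_institution):
    """Unweighted mean of per-institution AUROCs (an assumed aggregation)."""
    return mean(auroc(y, s) for y, s in per_institution.values())


# Hypothetical toy predictions from one model evaluated at two institutions.
toy_results = {
    "institution_a": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "institution_b": ([0, 1, 1, 0], [0.2, 0.9, 0.6, 0.3]),
}

for name, (y, s) in toy_results.items():
    print(name, auroc(y, s))
print("mean", mean_auroc(toy_results))
```

Evaluating one model on every institution's held-out data, rather than only the most recent one, is what exposes forgetting: a model that drifts toward the last training site loses AUROC on the earlier sites, which lowers the average.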

Source journal

Computers in Biology and Medicine (Engineering and Technology - Engineering: Biomedical)

CiteScore: 11.70
Self-citation rate: 10.40%
Articles published: 1086
Review time: 74 days

About the journal: Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.