Haresh Rengaraj Rajamohan, Yanqi Xu, Weicheng Zhu, Richard Kijowski, Kyunghyun Cho, Krzysztof J Geras, Narges Razavian, Cem M Deniz
medRxiv : the preprint server for health sciences. Published 2025-09-25. DOI: https://doi.org/10.1101/2025.09.22.25336414. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12486016/pdf/
Author summary
In our study, we addressed a common problem in medical AI: how to accurately predict the future course of a disease when long-term patient data are scarce. We focused on knee osteoarthritis, Alzheimer's disease, and breast cancer. We found that by first training a model on a larger, more common type of data (diagnostic images used to assess a patient's current disease state), we could significantly improve its ability to predict disease progression. We then developed a specialized training method that allows a single AI model to perform both diagnostic and prognostic tasks effectively. A key challenge is that models tend to "forget" their original diagnostic skills when they learn a new prediction task. In clinical settings this poses a safety risk, because it can lead to missed diagnoses. We used experience replay to overcome this problem by continually refreshing the model's diagnostic knowledge. This creates a more robust and efficient model that mirrors the clinician's workflow and offers the potential to improve patient care using limited, hard-to-obtain longitudinal data.
Robust Disease Prognosis via Diagnostic Knowledge Preservation: A Sequential Learning Approach.
Accurate disease prognosis is essential for patient care but is often hindered by the lack of long-term data. This study explores deep learning training strategies that utilize large, accessible diagnostic datasets to pretrain models aimed at predicting future disease progression in knee osteoarthritis (OA), Alzheimer's disease (AD), and breast cancer (BC). While diagnostic pretraining improves prognostic task performance, naive fine-tuning for prognosis can cause 'catastrophic forgetting,' where the model's original diagnostic accuracy degrades, a significant patient safety concern in real-world settings. To address this, we propose a sequential learning strategy with experience replay. We used cohorts with knee radiographs, brain MRIs, and digital mammograms to predict 4-year structural worsening in OA, 2-year cognitive decline in AD, and 5-year cancer diagnosis in BC. Our results showed that diagnostic pretraining on larger datasets improved prognosis model performance compared to standard baselines, boosting both the Area Under the Receiver Operating Characteristic curve (AUROC) (e.g., Knee OA external: 0.77 vs 0.747; Breast Cancer: 0.874 vs 0.848) and the Area Under the Precision-Recall Curve (AUPRC) (e.g., Alzheimer's Disease: 0.752 vs 0.683). Additionally, a sequential learning approach with experience replay achieved prognostic performance comparable to dedicated single-task models (e.g., Breast Cancer AUROC 0.876 vs 0.874) while also preserving diagnostic ability. This method maintained high diagnostic accuracy (e.g., Breast Cancer Balanced Accuracy 50.4% vs 50.9% for a dedicated diagnostic model), unlike simpler multitask methods prone to catastrophic forgetting (e.g., 37.7%). Our findings show that leveraging large diagnostic datasets is a reliable and data-efficient way to enhance prognostic models while maintaining essential diagnostic skills.
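The experience-replay idea named in the abstract can be illustrated with a small batching sketch. This is not the authors' implementation: the abstract does not specify how replayed diagnostic samples are selected or mixed, so the function name `replay_batches`, the `replay_fraction` parameter, and the `(task, index)` placeholder samples are all assumptions made for illustration. The sketch shows only the general pattern, in which each fine-tuning batch for the new prognosis task also carries a few samples from the earlier diagnostic task so that the model keeps receiving diagnostic supervision.

```python
import random
from typing import Iterator, List, Sequence, Tuple

# Hypothetical placeholder for an (image, label) pair: (task name, sample index).
Sample = Tuple[str, int]


def replay_batches(
    prognosis: Sequence[Sample],
    diagnosis_buffer: Sequence[Sample],
    batch_size: int = 8,
    replay_fraction: float = 0.25,  # assumed value, not taken from the paper
    seed: int = 0,
) -> Iterator[List[Sample]]:
    """Yield mixed batches for one pass over the prognosis data.

    Each batch holds new prognosis samples plus a random draw from the
    stored diagnostic samples (the "replay" part).
    """
    rng = random.Random(seed)
    n_replay = int(round(batch_size * replay_fraction))
    n_new = batch_size - n_replay
    if n_new <= 0:
        raise ValueError("replay_fraction leaves no room for prognosis samples")

    # Shuffle the prognosis samples once, then walk through them in chunks.
    order = list(prognosis)
    rng.shuffle(order)
    buffer = list(diagnosis_buffer)

    for start in range(0, len(order), n_new):
        new_part = order[start:start + n_new]
        # Draw replayed diagnostic samples without replacement within a batch.
        replay_part = rng.sample(buffer, min(n_replay, len(buffer)))
        yield new_part + replay_part


if __name__ == "__main__":
    prognosis_data = [("prognosis", i) for i in range(20)]
    diagnosis_data = [("diagnosis", i) for i in range(100)]
    for batch in replay_batches(prognosis_data, diagnosis_data):
        n_diag = sum(1 for task, _ in batch if task == "diagnosis")
        print(len(batch), "samples,", n_diag, "replayed diagnostic")
```

In a real training loop, each mixed batch would be routed to the loss of its own task (a diagnostic loss for replayed samples, a prognostic loss for new ones). How those losses are weighted is a design choice that the abstract does not describe.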