Xiaofei Liu, Jun Zhang, Larry Liu, Xun Tang, Pei Gao
A population-based recalibration method for updating survival neural networks models for cardiovascular risk prediction in United Kingdom and China
Journal of Clinical Epidemiology, volume 185, Article 111895. Published 2025-06-30. DOI: 10.1016/j.jclinepi.2025.111895
https://www.sciencedirect.com/science/article/pii/S0895435625002288
Abstract
Objectives
Machine learning algorithms, particularly survival neural networks (SNNs), promise to improve cardiovascular disease (CVD) risk prediction. However, whether and how SNN models should be recalibrated across diverse populations remains unclear. We aimed to propose a population-based recalibration method and validate it using two large cohort studies.
Study Design and Setting
A total of 347,206 individuals aged 40–74 years without prior CVD from UK Biobank (UKB) were used for model training and internal validation, and 177,756 individuals from a Chinese cohort study (CHinese Electronic health Records Research in Yinzhou [CHERRY]) were used for external validation. Three types of SNN models (DeepSurv, age-specific DeepSurv, and DeepHit) were developed for 10-year CVD risk prediction and compared with Cox models. These models were recalibrated using the proposed method, which adjusts for differences in disease incidence between populations based on population-level summarized data, and the results were compared with those of a traditional method that used individual-level data.
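The abstract does not give the recalibration formula, so the following is only a minimal sketch of one common way to recalibrate with summary data, not the authors' method. It assumes a single offset on the complementary log-log scale, chosen so that the mean predicted risk moves to the incidence reported for the target population. The function names and any numbers passed to them are hypothetical.

```python
import math


def cloglog(p):
    """Complementary log-log transform of a risk p in (0, 1)."""
    return math.log(-math.log(1.0 - p))


def inv_cloglog(z):
    """Inverse of cloglog: maps a real number back to a risk in (0, 1)."""
    return 1.0 - math.exp(-math.exp(z))


def recalibrate(risks, mean_predicted, target_incidence):
    """Shift predicted risks using population-level summaries only.

    risks            -- model-predicted 10-year risks for individuals
    mean_predicted   -- mean predicted risk in the target population
    target_incidence -- observed 10-year incidence in the target population

    Assumption: one constant offset on the cloglog scale is enough.
    The network that produced `risks` is left untouched.
    """
    delta = cloglog(target_incidence) - cloglog(mean_predicted)
    return [inv_cloglog(cloglog(p) + delta) for p in risks]
```

Because the offset is applied on a monotone scale, the ranking of individuals is unchanged, so discrimination is preserved while the overall risk level shifts.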
Results
All SNN models demonstrated robust discrimination in both the UKB and CHERRY validation sets (C-indices > 0.720), but underpredicted risk for the CHERRY population by 60%. The population-based recalibration method largely corrected the initial risk underestimation, yielding observed-to-expected (O:E) ratios of 1.080, 1.115, and 1.153 for recalibrated DeepSurv, age-specific DeepSurv, and DeepHit, respectively. This calibration was comparable to that of the individual-based recalibration method (O:E ratios: 1.040 and 1.054 for DeepSurv and age-specific DeepSurv, respectively). The well-calibrated age-specific DeepSurv and Cox models identified high-risk groups with distinct characteristics, with 79% overlap for women and 63% for men.
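As an illustration of the two kinds of metric reported above, the sketch below computes Harrell's C-index (discrimination) and a simple O:E ratio (calibration). It assumes the observed 10-year risk is supplied directly, for example from a Kaplan-Meier estimate; the function names are hypothetical and nothing here uses UKB or CHERRY data.

```python
def harrell_c(times, events, risks):
    """Harrell's C-index by direct pairwise comparison (O(n^2)).

    A pair (i, j) is comparable when subject i had an event and a
    shorter follow-up time than subject j. It is concordant when the
    subject who failed earlier was given the higher predicted risk;
    tied predictions count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def oe_ratio(observed_risk, predicted_risks):
    """Observed-to-expected ratio: observed risk / mean predicted risk."""
    expected = sum(predicted_risks) / len(predicted_risks)
    return observed_risk / expected
```

An O:E ratio above 1 means the model underpredicts risk on average, and a ratio below 1 means it overpredicts.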
Conclusion
The proposed method effectively adjusts predictions from survival neural network models using population-level summarized data without modifying the original network. Recalibration is essential when applying machine learning models to different populations. The method highlights the clinical potential of SNN models for broader application across diverse regions.
Plain Language Summary
Machine learning, particularly neural networks for survival analysis, shows great potential in disease risk prediction but typically requires adjustment, or "recalibration", for diverse populations. We proposed a recalibration method using population-level summarized data rather than individual data, which are often hard to obtain. We derived several survival neural network models for 10-year cardiovascular disease risk prediction in the UK Biobank and validated them in the CHinese Electronic health Records Research in Yinzhou (CHERRY) cohort. Although the models ranked individual risk effectively, they underpredicted actual risk in the Chinese population. The proposed method successfully corrected this underprediction, aligning model outputs with observed risk. This method works for different types of survival neural network models and preserves the original model structure. This advancement enhances the reliability and generalizability of survival neural network models across regions, underscoring the need for model evaluation and recalibration before applying them to new populations. It offers a practical solution for improving their accuracy without needing detailed individual data, thus broadening the clinical application of survival neural network models in disease prevention.
About the journal:
The Journal of Clinical Epidemiology strives to enhance the quality of clinical and patient-oriented healthcare research by advancing and applying innovative methods in conducting, presenting, synthesizing, disseminating, and translating research results into optimal clinical practice. Special emphasis is placed on training new generations of scientists and clinical practice leaders.