Machine Learning Models for ASCVD Risk Prediction in an Asian Population - How to Validate the Model is Important.

IF 1.8, Medicine Zone 4, JCR Q3, CARDIAC & CARDIOVASCULAR SYSTEMS

Acta Cardiologica Sinica, Pub Date: 2023-11-01, DOI: 10.6515/ACS.202311_39(6).20230528A

Yu-Chung Hsiao, Chen-Yuan Kuo, Fang-Ju Lin, Yen-Wen Wu, Tsung-Hsien Lin, Hung-I Yeh, Jaw-Wen Chen, Chau-Chung Wu

Volume 39, Issue 6, pages 901-912. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10646597/pdf/

Citations: 0

Abstract

Machine Learning Models for ASCVD Risk Prediction in an Asian Population - How to Validate the Model is Important.

Introduction: Atherosclerotic cardiovascular disease (ASCVD) is prevalent worldwide, including in Taiwan; however, widely accepted tools to assess the risk of ASCVD are lacking in Taiwan. Machine learning models are potentially useful for risk evaluation. In this study, we used two cohorts to test the feasibility of machine learning with transfer learning for developing an ASCVD risk prediction model in Taiwan.

Methods: Two multi-center observational registry cohorts, T-SPARCLE and T-PPARCLE, were used in this study. The variables were selected based on European, U.S. and Asian guidelines. Both registries recorded the ASCVD outcomes of the patients. Ten-fold validation and temporal validation were used to evaluate the performance of the binary classification analysis (prediction of major adverse cardiovascular [CV] events within one year). Time-to-event analyses were also performed.
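The two validation schemes named in the Methods differ only in how patients are assigned to training and test sets. The sketch below illustrates that difference under stated assumptions: it reads "ten-fold validation" as standard k-fold cross-validation, and it assumes the temporal split is made on an enrollment date with a single cutoff. The abstract does not say how the authors defined their temporal split, so the function names, the `enroll_dates` data, and the cutoff date are all hypothetical.

```python
import random
from datetime import date


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle indices once and return k (train_idx, test_idx) splits.

    Every sample appears in exactly one test fold, so each patient is
    scored by a model that never saw that patient during training.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train_idx, test_idx))
    return splits


def temporal_split(enroll_dates, cutoff):
    """Train on patients enrolled before `cutoff`, test on those enrolled later.

    Unlike k-fold, the test set comes from a later period, so shifts in
    feature distributions or event rates over time affect the result.
    """
    train_idx = [i for i, d in enumerate(enroll_dates) if d < cutoff]
    test_idx = [i for i, d in enumerate(enroll_dates) if d >= cutoff]
    return train_idx, test_idx


# Hypothetical data: one patient enrolled per month from 2015 through 2018.
enroll_dates = [date(2015 + m // 12, m % 12 + 1, 1) for m in range(48)]

splits = kfold_indices(len(enroll_dates), k=10)
train_idx, test_idx = temporal_split(enroll_dates, cutoff=date(2018, 1, 1))
```

In the k-fold case, early and late patients are mixed in every fold, which can hide a drift that the temporal split exposes.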

Results: In the binary classification analysis, eXtreme Gradient Boosting (XGBoost) and random forest had the best performance, with areas under the receiver operating characteristic curve (AUC-ROC) of 0.72 (0.68-0.76) and 0.73 (0.69-0.77), respectively, although neither was significantly better than the other models. Temporal validation was also performed, and the data showed significant differences in the distribution of various features and in the event rate. In temporal validation, the AUC-ROC of XGBoost dropped to 0.66 (0.59-0.73) and that of random forest dropped to 0.69 (0.62-0.76), and their performance also became numerically worse than that of the logistic regression model. In the time-to-event analysis, most models had a concordance index of around 0.70.
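For readers unfamiliar with the two metrics reported in the Results, the following is a minimal sketch of how each is defined. It is not the authors' code. AUC-ROC is computed as the fraction of (event, non-event) pairs that the model ranks correctly, and the concordance index uses Harrell's pairwise definition with the simplifying assumption that pairs with tied follow-up times are skipped. The toy inputs are invented for illustration.

```python
def auc_roc(scores, labels):
    """AUC-ROC: share of (event, non-event) pairs where the event scores higher.

    Tied scores count as half a correct ranking.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def concordance_index(times, events, risks):
    """Harrell's C: among comparable pairs, how often the patient with the
    earlier observed event also has the higher predicted risk.

    A pair (i, j) is comparable when patient i had an event (events[i] == 1)
    strictly before patient j's last follow-up time. Censored patients can
    only serve as the later member of a pair.
    """
    comparable = 0
    concordant = 0.0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy inputs (invented): 3 of 4 event/non-event pairs are ranked correctly.
auc = auc_roc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])

# Toy survival data (invented): 4 of 5 comparable pairs are concordant.
c_index = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.3, 0.5, 0.1])
```

Both metrics equal 0.5 for a model that ranks at random and 1.0 for perfect ranking, which is why an AUC-ROC near 0.7 and a concordance index near 0.70 describe a similar level of discrimination.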

Conclusions: Machine learning models with appropriate transfer learning may be a useful tool for the development of CV risk prediction models and may help improve patient care in the future.

Source journal

Acta Cardiologica Sinica (Medicine: Cardiovascular Systems)

CiteScore: 2.90

Self-citation rate: 15.80%

Articles published: 144

Review time: >12 weeks

About the journal: Acta Cardiologica Sinica welcomes papers in all fields related to cardiovascular medicine, including basic research, vascular biology, clinical pharmacology, clinical trials, critical care medicine, coronary artery disease, interventional cardiology, arrhythmia and electrophysiology, atherosclerosis, hypertension, cardiomyopathy and heart failure, valvular and structural cardiac disease, pediatric cardiology, and cardiovascular surgery, among others. We have received papers from more than 20 countries and areas of the world. Currently, 40% of the papers submitted to Acta Cardiologica Sinica come from Taiwan, 20% from China, and 20% from other countries and areas. The overall acceptance rate is around 50%.