Development of machine learning-based models to predict 10-year risk of cardiovascular disease: a prospective cohort study.
Authors: Jia You, Yu Guo, Ju-Jiao Kang, Hui-Fu Wang, Ming Yang, Jian-Feng Feng, Jin-Tai Yu, Wei Cheng
Journal: Stroke and Vascular Neurology, pages 475-485. Published: 2023-12-29. DOI: 10.1136/svn-2023-002332 (https://doi.org/10.1136/svn-2023-002332). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10800279/pdf/
Background: Previous prediction algorithms for cardiovascular diseases (CVD) were built on risk factors selected largely from empirical clinical knowledge. This study sought to identify predictors from a comprehensive variable space and then use machine learning (ML) algorithms to develop a novel CVD risk prediction model.
Methods: From UK Biobank, a longitudinal population-based cohort, this study included 473 611 CVD-free participants aged 37 to 73 years. We implemented an ML-based, data-driven pipeline to identify predictors from 645 candidate variables covering a comprehensive range of health-related factors, and assessed multiple ML classifiers to establish a risk prediction model for 10-year incident CVD. The model was validated using leave-one-center-out cross-validation.
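The leave-one-center-out scheme named in the Methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `leave_one_center_out`, the center labels, and the toy data are all hypothetical. Each fold holds out every participant from one assessment center and trains on the rest.

```python
from collections import defaultdict


def leave_one_center_out(centers):
    """Yield (held_out_center, train_idx, test_idx) once per distinct center.

    `centers` is a sequence giving the assessment-center label of each
    participant (hypothetical labels here; not taken from the paper).
    """
    by_center = defaultdict(list)
    for idx, center in enumerate(centers):
        by_center[center].append(idx)

    for held_out in sorted(by_center):
        test_idx = by_center[held_out]
        # Training set: every participant recruited at any other center.
        train_idx = sorted(
            i for c, idxs in by_center.items() if c != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


# Toy example: seven participants recruited at three centers.
centers = ["A", "A", "B", "C", "B", "C", "C"]
folds = list(leave_one_center_out(centers))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Because no participant from the held-out center appears in training, each fold approximates performance at a site the model has never seen, which a random split would not do.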
Results: During a median follow-up of 12.2 years, 31 466 participants developed CVD within 10 years of their baseline visit. A novel UK Biobank CVD risk prediction (UKCRP) model was established, comprising 10 predictors: age, sex, cholesterol and blood pressure medication, cholesterol ratio (total/high-density lipoprotein), systolic blood pressure, previous angina or heart disease, number of medications taken, cystatin C, chest pain and pack-years of smoking. The model achieved satisfactory discriminative performance, with an area under the receiver operating characteristic curve (AUC) of 0.762±0.010 that outperformed multiple existing clinical models, and it was well calibrated, with a Brier score of 0.057±0.006. Further, the UKCRP achieved comparable performance for myocardial infarction (AUC 0.774±0.011) and ischaemic stroke (AUC 0.730±0.020), but inferior performance for haemorrhagic stroke (AUC 0.644±0.026).
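For readers unfamiliar with the two metrics reported above, the sketch below computes AUC and the Brier score on toy data and summarises per-fold values as mean ± spread. It rests on assumptions: the abstract does not say whether the ± values are standard deviations across held-out centers, so the sample standard deviation used here is a guess, and all function names and numbers are illustrative rather than taken from the study.

```python
from statistics import mean, stdev


def roc_auc(y_true, y_score):
    """AUC as the probability that a random case scores higher than a
    random non-case, counting ties as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def summarise(per_fold_values):
    """Return (mean, sample standard deviation) across folds (assumed form of ±)."""
    return mean(per_fold_values), stdev(per_fold_values)


# Toy outcomes (1 = incident CVD within 10 years) and predicted risks.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)

# Hypothetical per-center AUCs, only to show the mean ± spread summary.
fold_mean, fold_sd = summarise([0.75, 0.77, 0.76])
print(f"AUC={auc:.3f} Brier={brier:.3f} folds={fold_mean:.3f}±{fold_sd:.3f}")
```

AUC measures discrimination (how well cases are ranked above non-cases), whereas the Brier score also penalises predicted risks that sit far from the observed outcomes, so the two are reported together.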
Conclusion: ML-based classification models can learn expressive representations that identify participants at potentially high risk of CVD, who may benefit from earlier clinical decisions.
About the journal:
Stroke and Vascular Neurology (SVN) is the official journal of the Chinese Stroke Association. Supported by a team of renowned Editors and fully Open Access, the journal encourages debate on controversial techniques and on issues of health policy and social medicine.