Enhancing cardiovascular risk prediction in Asian populations: A machine learning approach integrated with digital health platforms

Impact factor: 2.1 | JCR quartile: Q3 | Category: Peripheral Vascular Disease

International Journal of Cardiology Cardiovascular Risk and Prevention | Pub Date: 2025-09-09 | DOI: 10.1016/j.ijcrp.2025.200509

Sazzli Kasim, Putri Nur Fatin Amir Rudin, Xue Ning Kiew, Nurulain Ibrahim, Nafiza Mat Nasir, Lim Bing Feng, Hanis Hamidi, Khairul Shafiq Ibrahim, Raja Ezman Raja Shariff, Suraya Abdul-Razak, Kazuaki Negishi, Sorayya Malek

Volume 27, Article 200509 | Publication type: Journal Article | Full text: https://www.sciencedirect.com/science/article/pii/S2772487525001473

Citations: 0

Abstract

Objectives

This study aimed to develop and validate a machine learning (ML)-based model for cardiovascular disease (CVD) risk prediction in a Malaysian cohort representative of the Southeast Asian population.

Methods

Data from the Responding to Increasing Cardiovascular Disease Prevalence (REDISCOVER) Study, comprising 10,044 participants, were analyzed; 4,299 cases were retained after exclusions. The dataset was split into training (70%) and validation (30%) subsets. Logistic Regression (LR), Random Forest (RF), and Support Vector Machine (SVM) models were developed using feature selection techniques such as recursive feature elimination (RFE) and sequential backward elimination (SBE). Model performance was evaluated using the area under the curve (AUC), sensitivity, specificity, calibration, and Net Reclassification Index (NRI).
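
As an illustration of the feature-selection step, sequential backward elimination can be sketched in plain Python. This is a minimal sketch, not the study's implementation: the function name, the stopping rule (stop once every possible removal lowers the score), the toy feature names, and the toy scoring function are all assumptions made for illustration. In practice the score would typically be a cross-validated metric, such as AUC, from the model being tuned.

```python
from typing import Callable, List, Sequence, Tuple


def sequential_backward_elimination(
    features: Sequence[str],
    score_fn: Callable[[Tuple[str, ...]], float],
    min_features: int = 1,
) -> Tuple[List[str], float]:
    """Greedily drop one feature at a time while the score does not get worse."""
    current = list(features)
    best_score = score_fn(tuple(current))
    while len(current) > min_features:
        # Score every subset obtained by removing exactly one feature.
        candidates = [
            (score_fn(tuple(f for f in current if f != drop)), drop)
            for drop in current
        ]
        cand_score, drop = max(candidates, key=lambda c: c[0])
        if cand_score < best_score:
            break  # every removal hurts the score, so stop here
        current.remove(drop)
        best_score = cand_score
    return current, best_score


# Hypothetical toy score: informative features add value, and each
# retained feature carries a small complexity penalty.
TOY_WEIGHTS = {"age": 0.3, "sbp": 0.2, "noise1": 0.0, "noise2": 0.0}


def toy_score(subset: Tuple[str, ...]) -> float:
    return sum(TOY_WEIGHTS[f] for f in subset) - 0.01 * len(subset)


selected, score = sequential_backward_elimination(list(TOY_WEIGHTS), toy_score)
print(selected)
```

With this toy score the two uninformative features are removed and the two informative ones are kept; a real scoring function would be noisier, which is why a tolerance or cross-validation is usually added.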

Findings

Among the models evaluated, the SVM model with SBE-selected features performed best, achieving an AUC of 0.800. This was higher than the Framingham Risk Score (FRS; AUC = 0.693), Revised Pooled Cohort Equations (RPCE; AUC = 0.744), and WHO CVD charts (AUC = 0.741). NRI analysis showed significant improvement over FRS and RPCE (17.29% and 14.23%, respectively; p < 0.00001). Calibration analyses indicated that the ML models initially overpredicted risk, which Platt scaling mitigated.
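
To make the reclassification figures concrete, the following is a minimal sketch of a two-category NRI. The abstract does not state which NRI variant or risk cut-off was used, so the single threshold, the function name, and the toy risk values below are assumptions for illustration only.

```python
from typing import Sequence


def categorical_nri(
    y_true: Sequence[int],
    old_risk: Sequence[float],
    new_risk: Sequence[float],
    threshold: float,
) -> float:
    """Two-category NRI: net correct reclassification among events plus
    net correct reclassification among non-events."""

    def category(risk: float) -> int:
        return 1 if risk >= threshold else 0

    up_events = down_events = n_events = 0
    up_nonevents = down_nonevents = n_nonevents = 0
    for y, old, new in zip(y_true, old_risk, new_risk):
        move = category(new) - category(old)  # +1 moved up, -1 moved down
        if y == 1:
            n_events += 1
            up_events += int(move > 0)
            down_events += int(move < 0)
        else:
            n_nonevents += 1
            up_nonevents += int(move > 0)
            down_nonevents += int(move < 0)
    if n_events == 0 or n_nonevents == 0:
        raise ValueError("need at least one event and one non-event")
    # Events should move up; non-events should move down.
    return (up_events - down_events) / n_events + (
        down_nonevents - up_nonevents
    ) / n_nonevents


# Toy data (hypothetical): 3 events, 4 non-events, 20% risk threshold.
y = [1, 1, 1, 0, 0, 0, 0]
old = [0.10, 0.30, 0.05, 0.30, 0.10, 0.25, 0.05]
new = [0.30, 0.30, 0.10, 0.10, 0.10, 0.30, 0.05]
nri = categorical_nri(y, old, new, threshold=0.20)
print(f"NRI = {nri:.4f}")
```

In the toy data one of three events moves up a category and one of four non-events moves down, so the NRI is 1/3 + 1/4. A positive value means the new model reclassifies more people in the correct direction than the old score does.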

Conclusion

ML-based models incorporating regionally relevant variables demonstrated improved discrimination and reclassification compared with conventional risk scores in this Malaysian cohort. Further external validation is needed to establish their utility across broader Southeast Asian populations.




Source journal