Machine-learning-based prediction of cardiovascular events for hyperlipidemia population with lipid variability and remnant cholesterol as biomarkers.

IF 4.7 · Zone 3 (Medicine) · JCR Q1, Medical Informatics
Health Information Science and Systems Pub Date : 2024-11-11 eCollection Date: 2024-12-01 DOI:10.1007/s13755-024-00310-w
Zhenzhen Du, Shuang Wang, Ouzhou Yang, Juan He, Yujie Yang, Jing Zheng, Honglei Zhao, Yunpeng Cai
Health Information Science and Systems, volume 12, issue 1, article 51. Publication type: Journal Article. PMC full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11551092/pdf/
Citations: 0

Abstract

Purpose: Dyslipidemia is a significant risk factor for progression to cardiovascular disease (CVD). Although numerous risk factors have been identified and various risk scales proposed, effective models for predicting CVD onset in the hyperlipidemic population are still urgently needed and are essential for CVD prevention.

Methods: We carried out a retrospective cohort study of 23,548 hyperlipidemia patients from the Shenzhen Health Information Big Data Platform, including 11,723 CVD onset cases within a 3-year follow-up. The population was randomly divided, with 70% used as an independent training dataset and the remaining 30% as the test set. Four distinct machine-learning algorithms were trained on the training dataset to develop highly accurate predictive models, and their performance was then benchmarked against conventional risk assessment scales. An ablation study was also carried out to analyze the impact of individual risk factors on model performance.
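The evaluation protocol described in the Methods (a random 70/30 split followed by AUROC scoring) can be sketched as below. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the function names `split_70_30` and `auroc`, the fixed seed, and the pairwise AUROC computation are all assumptions made for the example.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle a copy of `records` and return (train, test) in a 70/30 ratio.

    The seed is a hypothetical choice for reproducibility; the abstract
    only states that the split was random.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Ties count as half a win. Quadratic in the number of cases, so it
    is suitable for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy patient IDs stand in for the cohort records.
    train, test = split_70_30(range(10), seed=1)
    print(len(train), len(test))
    # Toy labels (1 = CVD onset) and model risk scores.
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In practice a library routine would normally replace the hand-written `auroc`; the pairwise form is shown here because it makes the ranking interpretation of the metric explicit.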

Results: The non-linear algorithm LightGBM performed best in forecasting the incidence of cardiovascular disease within 3 years, achieving an area under the receiver operating characteristic curve (AUROC) of 0.883. This surpassed the conventional logistic regression model, which had an AUROC of 0.725 on identical datasets. In direct comparisons, the machine-learning approaches also notably outperformed three traditional risk assessment methods within their respective applicable populations: the Framingham cardiovascular disease risk score, the 2019 ESC/EAS guidelines for the management of dyslipidemia, and the 2016 Chinese recommendations for the management of dyslipidemia in adults. Further analysis of risk factors showed that blood lipid variability and remnant cholesterol played an important role in indicating an increased risk of CVD.
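The abstract does not state how the two highlighted biomarkers were computed. As a hedged sketch, the example below assumes the commonly used definition of remnant cholesterol (total cholesterol minus LDL-C minus HDL-C) and summarizes visit-to-visit lipid variability by the sample standard deviation and coefficient of variation. The function names, the mmol/L units, and the choice of variability metrics are illustrative assumptions, not the study's confirmed method.

```python
from statistics import mean, stdev


def remnant_cholesterol(total_c, ldl_c, hdl_c):
    """Remnant cholesterol under the assumed definition TC - LDL-C - HDL-C.

    All inputs are expected in the same unit (mmol/L is assumed here).
    """
    return total_c - ldl_c - hdl_c


def lipid_variability(values):
    """Summarize repeated measurements of one lipid for one patient.

    Returns the sample standard deviation ("sd") and the coefficient of
    variation ("cv" = sd / mean). At least two visits are required.
    """
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    m = mean(values)
    sd = stdev(values)
    return {"sd": sd, "cv": sd / m}


if __name__ == "__main__":
    # Hypothetical single-visit panel in mmol/L.
    print(remnant_cholesterol(total_c=5.2, ldl_c=3.1, hdl_c=1.3))
    # Hypothetical LDL-C values from four visits.
    print(lipid_variability([3.1, 2.8, 3.6, 3.0]))
```

Features of this kind could then be added to or removed from a model's input set, which is the idea behind the ablation study mentioned in the Methods.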

Conclusions: We have shown that machine-learning techniques significantly enhance the precision of cardiovascular risk forecasting among hyperlipidemic patients, addressing the heterogeneity and non-linearity of disease prediction. Furthermore, some recently suggested biomarkers, including blood lipid variability and remnant cholesterol, are also important predictors of cardiovascular events, which suggests the importance of continuous lipid monitoring and healthcare profiling through big data platforms.

Source journal: CiteScore 11.30; self-citation rate 5.00%; articles published 30
Journal description: Health Information Science and Systems is a multidisciplinary journal that integrates artificial intelligence, computer science and information technology with health science and services. It covers information science research together with topics related to the modeling, design, development, integration and management of health information systems, smart health, artificial intelligence in medicine, computer-aided diagnosis and medical expert systems. The scope includes: (i) smart health, artificial intelligence in medicine, computer-aided diagnosis, medical image processing and medical expert systems; (ii) medical big data and medical, health and biomedical information resources such as patient medical records, devices and equipment, and software and tools to capture, store, retrieve, process, analyze and optimize the use of information in the health domain; (iii) data management, data mining and knowledge discovery, all of which play a key role in decision making, management of public health, examination of standards, and privacy and security issues; (iv) development of new architectures and applications for health information systems.