Predicting the prevalence of cardiovascular diseases using machine learning algorithms

Bernada E. Sianga, Maurice C. Mbago, Amina S. Msengwa
Journal: Intelligence-based medicine, Volume 11, Article 100199
Published: 2025-01-01
DOI: 10.1016/j.ibmed.2025.100199
URL: https://www.sciencedirect.com/science/article/pii/S266652122500002X
Citations: 0

Abstract
Cardiovascular diseases (CVDs) are the major cause of morbidity, disability, and mortality worldwide and are the most life-threatening diseases. Early detection and appropriate action can significantly reduce the effects and complications of CVD, so predicting the likelihood that an individual will develop adverse CVD outcomes is essential. This study uses machine learning (ML) methods to predict CVD risk. Model hyperparameters were tuned using grid search and randomized search, and the tuning method with the higher accuracy was used to set the parameters of the six algorithms used in this study. Two experiments were run: the first trained and tested the tuned ML algorithms on the CVD dataset without geographical features, and the second included them. The geographical features are the air humidity, temperature, and education status of a location. The two experiments were compared using classification metrics, and the second outperformed the first. With geographical features included, XGBoost achieved the highest accuracy (95.24%), followed by the decision tree (93.87%) and the support vector machine (92.87%). Including geographical risk factors when predicting CVD is crucial, as they contribute to the probability of developing CVD.
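
The workflow in the abstract (tune with both grid search and randomized search, keep the better of the two, then compare a run without geographical features against a run with them) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: the data is synthetic, the choice of which columns count as "geographical" is hypothetical, the parameter grid is invented for the example, and a scikit-learn decision tree stands in for the six algorithms of the study.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the CVD dataset (assumption: the real data is not available here).
X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                           n_redundant=0, random_state=0)

# Hypothetical column split: pretend the last three columns are the geographical
# features (humidity, temperature, education status of a location).
BASE_COLS = list(range(7))
ALL_COLS = list(range(10))

# Illustrative search space, not taken from the paper.
PARAM_GRID = {"max_depth": [3, 5, 8, None], "min_samples_leaf": [1, 5, 10]}


def tune_and_score(cols):
    """Run both search strategies, keep the one with the higher CV accuracy,
    and return (strategy name, best parameters, held-out test accuracy)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X[:, cols], y, test_size=0.25, stratify=y, random_state=0)
    searches = {
        "grid": GridSearchCV(DecisionTreeClassifier(random_state=0), PARAM_GRID,
                             cv=5, scoring="accuracy"),
        "random": RandomizedSearchCV(DecisionTreeClassifier(random_state=0), PARAM_GRID,
                                     n_iter=6, cv=5, scoring="accuracy", random_state=0),
    }
    for search in searches.values():
        search.fit(X_train, y_train)
    # Pick the strategy whose best candidate scored higher in cross-validation.
    best_name = max(searches, key=lambda name: searches[name].best_score_)
    best = searches[best_name]
    test_acc = accuracy_score(y_test, best.predict(X_test))
    return best_name, best.best_params_, test_acc


# Experiment 1: without the "geographical" columns; experiment 2: with them.
results = {
    "without_geo": tune_and_score(BASE_COLS),
    "with_geo": tune_and_score(ALL_COLS),
}
for label, (strategy, params, acc) in results.items():
    print(f"{label}: strategy={strategy}, params={params}, test accuracy={acc:.4f}")
```

Because the data is synthetic, the printed accuracies say nothing about the 95.24% figure reported above; the sketch only shows the shape of the two-experiment comparison. Swapping in another estimator would require a parameter grid suited to that model.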
Source journal: Intelligence-based medicine (Health Informatics)
CiteScore: 5.00
Self-citation rate: 0.00%
Articles published: 0
Review time: 187 days