Comparing the Performance of Machine Learning Models in Predicting the Risk of Chronic Kidney Disease

Journal of Archives in Military Medicine, Pub Date: 2024-02-18, DOI: 10.5812/jamm-140885

Sina Moosavi Kashani, Sanaz Zargar Balaye Jame


Citations: 0

Abstract


Comparing the Performance of Machine Learning Models in Predicting the Risk of Chronic Kidney Disease

Background: Chronic kidney disease (CKD) poses a significant health burden worldwide, affecting approximately 10-15% of the global population. As one of the leading non-communicable diseases, CKD is a major cause of morbidity and mortality. Early identification of CKD is crucial for reducing its adverse effects on patient health and for improving outcomes for individuals with the disease.

Objectives: This study aimed to evaluate and compare the effectiveness of various machine learning models in predicting the occurrence of CKD.

Methods: Data were collected from a sample of 400 patients. We applied the cross-industry standard process for data mining (CRISP-DM) methodology to analyze the data. As part of this process, we handled missing data using mode imputation and addressed outliers using the interquartile range (IQR) method. We used CatBoost (CB), random forest (RF), and artificial neural network (ANN) models to predict outcomes. For evaluation, we used the receiver operating characteristic (ROC) curve and calculated the area under the curve (AUC).

Results: Analysis of the 400 patient records identified serum creatinine, packed cell volume, specific gravity, and hemoglobin as the most influential variables in predicting CKD. The CB and RF models outperformed the ANN in predicting the disease. Ten critical predictors were identified for accurate disease prediction.

Conclusions: The ensemble models in this study were both fast and more accurate than the ANN. These findings suggest that ensemble models may be an effective tool for improving predictive performance in similar studies.
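The preprocessing and evaluation steps named in the Methods can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the authors' code: the abstract does not say whether outliers were removed or capped, so capping at the 1.5 x IQR Tukey fences is an assumption here, and the function names (`impute_mode`, `clip_iqr`, `roc_auc`) and the toy values are hypothetical.

```python
from statistics import mode, quantiles


def impute_mode(values):
    """Replace missing entries (None) with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    fill = mode(observed)
    return [fill if v is None else v for v in values]


def clip_iqr(values, k=1.5):
    """Cap values outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the abstract only says outliers were 'addressed' with the
    IQR method; capping (rather than dropping rows) is one common choice.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up values for illustration only (not data from the study).
creatinine = [0.9, 1.1, None, 1.0, 1.1, 9.5, 1.2, 1.1]
cleaned = clip_iqr(impute_mode(creatinine))
print(cleaned)

# Made-up labels (1 = CKD) and model scores.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

In a comparison like the one described, the same `roc_auc` value would be computed for each model's predicted scores on held-out patients, and the models ranked by it.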


Source journal

Journal of Archives in Military Medicine

Self-citation rate

0.00%

Number of publications