Machine learning for classifying chronic kidney disease and predicting creatinine levels using at-home measurements

Brady Metherall, Anna K. Berryman, Georgia S. Brennan
medRxiv - Nephrology, published 2024-03-18. DOI: 10.1101/2024.03.15.24304364 (https://doi.org/10.1101/2024.03.15.24304364)

Abstract

Background: Chronic kidney disease (CKD) is a global health concern, and early detection plays a pivotal role in effective management. Machine learning models show promise in CKD detection, yet how the choice of clinical feature set affects detection and classification remains under-explored.

Methods: In this study, we focus on CKD classification and creatinine prediction using three sets of features: at-home, monitoring, and laboratory. We employ artificial neural networks (ANNs) and random forests (RFs) on a dataset of 400 patients with 25 input features, which we divide into the three feature sets. Using 10-fold cross-validation, we calculate metrics such as accuracy, true positive rate (TPR), true negative rate (TNR), and mean squared error.

Results: RFs achieve higher accuracy (92.5%) than ANNs (82.9%) in at-home CKD classification. ANNs achieve a higher TPR (92.0%) but a lower TNR (67.9%) than RFs (90.0% and 95.8%, respectively). For monitoring and laboratory features, both methods achieve accuracies exceeding 98%. The R² score for creatinine regression is approximately 0.3 higher with laboratory features than with at-home features. Feature importance analysis identifies hemoglobin and blood urea as key clinical variables, and hypertension and diabetes mellitus as key comorbidities, in agreement with previous studies.

Conclusions: Machine learning models, particularly RFs, show promise in CKD diagnosis and highlight significant features in CKD detection. Moreover, such models may assist in screening a general population using at-home features, potentially increasing early detection of CKD and thereby improving patient care and offering hope for a more effective approach to managing this prevalent health condition.
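The evaluation quantities named in the abstract (accuracy, TPR, TNR, mean squared error, R², and a 10-fold split) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the label encoding (1 = CKD, 0 = no CKD), and the shuffled, unstratified fold assignment are all assumptions made for illustration, and no model is trained here.

```python
import random


def classification_metrics(y_true, y_pred):
    """Accuracy, TPR (sensitivity), and TNR (specificity) for binary labels.

    Assumed encoding (hypothetical): 1 = CKD, 0 = no CKD.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Guard against folds that contain no positives or no negatives.
        "tpr": tp / (tp + fn) if (tp + fn) else float("nan"),
        "tnr": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for k shuffled, disjoint folds.

    A plain (unstratified) split; whether the study stratified its folds
    is not stated in the abstract.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test
```

In use, a classifier or regressor would be fitted on each `train` index set and scored on the matching `test` set, with the per-fold metrics then averaged. With 400 samples and k = 10, each test fold holds 40 patients.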