Machine learning for classifying chronic kidney disease and predicting creatinine levels using at-home measurements
Brady Metherall, Anna K. Berryman, Georgia S. Brennan
medRxiv - Nephrology, published 2024-03-18
DOI: 10.1101/2024.03.15.24304364 (https://doi.org/10.1101/2024.03.15.24304364)
Abstract
Background: Chronic kidney disease (CKD) is a global health concern, and early detection plays a pivotal role in effective management. Machine learning models show promise in CKD detection, yet the impact of different sets of clinical features on detection and classification remains under-explored.
Methods: In this study, we focus on CKD classification and creatinine prediction using three sets of features: at-home, monitoring, and laboratory. We employ artificial neural networks (ANNs) and random forests (RFs) on a dataset of 400 patients with 25 input features, which we divide into the three feature sets. Using 10-fold cross-validation, we calculate metrics such as accuracy, true positive rate (TPR), true negative rate (TNR), and mean squared error.
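The evaluation quantities named in the Methods can be illustrated with a short Python sketch. It is not the authors' implementation: the helper names (`k_fold_indices`, `classification_metrics`, `mean_squared_error`) are hypothetical, the labels are assumed to be coded 1 for CKD and 0 for non-CKD, and the folds are contiguous and unshuffled for simplicity, whereas a real study would normally shuffle or stratify them.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_samples-1 into k contiguous (train, test) folds.

    Assumption: no shuffling or stratification, purely for illustration.
    """
    # Distribute any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, TPR (sensitivity), and TNR (specificity) for 0/1 labels (1 = CKD)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "tpr": tp / (tp + fn) if (tp + fn) else float("nan"),
        "tnr": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    folds = k_fold_indices(400, k=10)
    print(len(folds), len(folds[0][0]), len(folds[0][1]))
    print(classification_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a full pipeline, a model would be fitted on each training fold and the metrics averaged over the ten held-out folds.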
Results: RFs achieve higher accuracy (92.5%) than ANNs (82.9%) in at-home CKD classification. ANNs achieve a higher TPR (92.0%) but a lower TNR (67.9%) than RFs (90.0% and 95.8%, respectively). For monitoring and laboratory features, both methods achieve accuracies exceeding 98%. The R² score for creatinine regression is approximately 0.3 higher with laboratory features than with at-home features. Feature importance analysis identifies hemoglobin and blood urea as key clinical variables, and hypertension and diabetes mellitus as key comorbidities, in agreement with previous studies.
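For readers unfamiliar with the regression score quoted above, the following is a minimal sketch that assumes the standard coefficient of determination, one minus the ratio of the residual sum of squares to the total sum of squares. The abstract does not state which definition or library was used, and the function name `r2_score` here is a local, hypothetical helper.

```python
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant (otherwise SS_tot is zero).
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # A perfect prediction scores 1; always predicting the mean scores 0.
    print(r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
    print(r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```

Under this definition, a gap of about 0.3 means the laboratory features explain roughly 30 percentage points more of the variance in creatinine than the at-home features.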
Conclusions: Machine learning models, particularly RFs, show promise in CKD diagnosis and highlight significant features in CKD detection. Moreover, such models may assist in screening a general population using at-home features, potentially increasing early detection of CKD, thus improving patient care and offering hope for a more effective approach to managing this prevalent health condition.