Neural network-based analysis of clinical and demographic variables for predicting platelet counts and prothrombin time in chronic kidney disease patients
Authors: Simin Nazari, Amira Abdelrasoul
Journal: Engineering Applications of Artificial Intelligence, volume 159, Article 111741
Publication date: 2025-07-22
DOI: 10.1016/j.engappai.2025.111741
URL: https://www.sciencedirect.com/science/article/pii/S0952197625017439
Publication type: Journal Article
Impact factor: 7.5 (JCR Q1, Automation & Control Systems; category: Computer Science)
Citations: 0
Abstract
Objective
In chronic kidney disease (CKD), indicators such as platelet count, prothrombin time (PT), and depression play pivotal roles in patient management and outcome prediction. Platelet count and PT are essential for assessing coagulation status, which can be compromised in CKD, while depression significantly affects quality of life and treatment adherence. This study uses machine learning (ML) techniques to examine how variables interact in CKD and how they collectively influence platelet count and PT. Specifically, we use neural networks to examine how patient characteristics, including biological sex, age, and pre-existing medical conditions, affect these blood parameters. The goal is to provide clinically actionable insights that could support personalized monitoring and more effective risk stratification in CKD management.
Data source and setting
The data for this study were derived from an open-source dataset, originally utilized in the study by Li et al. and also published on Kaggle, an open data platform.
Methods
Data were sourced from the Medical Information Mart for Intensive Care (MIMIC-III) Database, comprising a cohort of 1177 patients for the analysis. The dataset was randomly split into a training set and a validation set.
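The random split described above can be sketched as follows. This is a minimal illustration only: the abstract does not state the split ratio or any seed, so the 80/20 ratio, the seed value, and the helper name `split_indices` are assumptions.

```python
import random

def split_indices(n_patients, train_fraction=0.8, seed=42):
    """Shuffle patient indices and cut them into training and validation sets."""
    indices = list(range(n_patients))
    # A local RNG keeps the split reproducible without touching global state.
    random.Random(seed).shuffle(indices)
    cut = int(n_patients * train_fraction)
    return indices[:cut], indices[cut:]

# 1177 is the cohort size reported in the abstract; 0.8 is an assumed ratio.
train_idx, val_idx = split_indices(1177)
print(len(train_idx), len(val_idx))
```

Fixing the seed makes the split repeatable, which matters when validation metrics are compared across runs.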
We used a sequential neural network (NN) model trained with backpropagation to predict platelet count, PT, and depression indicators. Model performance was evaluated on the validation set by tracking loss reduction and accuracy improvement over training iterations. Mean Squared Error (MSE) and Mean Absolute Error (MAE) were additionally used to quantify prediction error.
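To make the training and evaluation steps concrete, here is a minimal NumPy sketch of a small feed-forward network trained by backpropagation and scored with MSE and MAE. It is not the authors' model: the architecture (one hidden ReLU layer), the hyperparameters, the function name `train_mlp`, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error."""
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def train_mlp(X, y, hidden=8, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer ReLU network with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.5, size=(d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        # Forward pass.
        z1 = X @ W1 + b1
        h = np.maximum(z1, 0.0)
        pred = (h @ W2 + b2).ravel()
        err = pred - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: chain rule from the MSE loss back to each weight.
        d_pred = (2.0 / n) * err[:, None]
        dW2 = h.T @ d_pred
        db2 = d_pred.sum(axis=0)
        d_z1 = (d_pred @ W2.T) * (z1 > 0)
        dW1 = X.T @ d_z1
        db1 = d_z1.sum(axis=0)
        # Gradient descent update.
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2

    def predict(X_new):
        return (np.maximum(X_new @ W1 + b1, 0.0) @ W2 + b2).ravel()

    return predict, losses

# Synthetic stand-in data (NOT the MIMIC-III cohort): 3 features scaled to [0, 1].
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(300, 3))
y = 0.4 + 0.3 * X[:, 0] - 0.2 * X[:, 1] + 0.1 * X[:, 0] * X[:, 2]
X_train, y_train = X[:240], y[:240]
X_val, y_val = X[240:], y[240:]

predict, losses = train_mlp(X_train, y_train)
val_pred = predict(X_val)
val_mse, val_mae = mse(y_val, val_pred), mae(y_val, val_pred)
print(f"train loss {losses[0]:.4f} -> {losses[-1]:.4f}; "
      f"val MSE {val_mse:.4f}, MAE {val_mae:.4f}")
```

Recording the loss at every epoch is what allows a convergence curve to be inspected, while MSE and MAE on the held-out rows summarize error on data the network did not see during training.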
Results
The NN model was trained to predict platelet count and PT, both critical indicators in patients with CKD. The model achieved an accuracy of 65.01% for platelet count prediction (MSE = 0.0098) and 75.85% for PT prediction (MSE = 0.0091); the training loss decreased rapidly to a plateau, indicating stable convergence. For platelet count, regression analysis identified blood acidity (pH) (β = 0.180, p < 0.001) and age (β = −0.162, p < 0.001) as the most significant predictors. In contrast, Shapley Additive Explanations (SHAP)-based feature importance analysis ranked pH and gender as the most impactful features, followed by hyperlipemia and age, highlighting the non-linear interactions captured by the neural network. Similarly, PT prediction was significantly influenced by hyperlipemia (β = 1.4162, p < 0.05) and hypertensive status (β = −1.5667, p < 0.05) in the regression model, and SHAP also identified these two variables as the most influential contributors. These findings underscore the complementary strengths of traditional statistical methods and explainable artificial intelligence (AI) techniques such as SHAP in capturing both direct effects and complex, non-linear relationships, which improves model interpretability and clinical applicability in predictive healthcare analytics.
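SHAP attributes a prediction to individual inputs using Shapley values. As a sketch of that idea (not the authors' pipeline and not the `shap` library), the block below computes exact Shapley values for a hypothetical three-feature toy model containing an interaction term. The model, its coefficients, the feature ordering (pH, age, hyperlipemia), the example patient, and the baseline are all illustrative assumptions.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Weight of a coalition of size k that does not contain feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def toy_model(z):
    # Hypothetical model: linear terms plus a pH x hyperlipemia interaction.
    ph, age, hyperlipemia = z
    return 0.2 * ph - 0.1 * age + 0.3 * ph * hyperlipemia

x = [0.8, 0.5, 1.0]         # hypothetical scaled patient record
baseline = [0.5, 0.4, 0.0]  # hypothetical reference record
phi = shapley_values(toy_model, x, baseline)
print(dict(zip(["pH", "age", "hyperlipemia"], phi)))
```

The attributions always sum to the difference between the prediction and the baseline prediction, and the interaction term is shared between the two features involved. A purely linear regression coefficient cannot express that sharing, which is one reason the two rankings can differ.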
Conclusions
The backpropagation-trained NN model used in this study represents a robust approach for analyzing and predicting key coagulation indicators in CKD patients. By identifying the most influential clinical variables driving variation in platelet count and PT, this framework offers practical value for patient-specific risk assessment, monitoring strategies, and individualized treatment planning. Despite limitations related to dataset size, the model provides a foundation for clinically meaningful decision-support tools in nephrology.
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.