Nan Cheng, Zian Yi, Jiayue Wang, Zhenliang Hui, Jun Chen, An Gao
Journal of Clinical Neuroscience, volume 136, Article 111266. Published 2025-04-21. DOI: 10.1016/j.jocn.2025.111266. URL: https://www.sciencedirect.com/science/article/pii/S0967586825002383
Initial seizure episodes risk factors identification during hospitalization of ICU patients: A retrospective analysis of the eICU collaborative research database
Background
We aimed to identify risk factors for initial seizure episodes in ICU patients using various machine learning algorithms.
Methods
Using the extensive eICU database, we curated a dataset of 200,859 patient records, with 15,890 patients meeting inclusion and exclusion criteria. Among them, 497 experienced initial seizure episodes during hospitalization. We developed models to identify risk factors associated with these episodes using Logistic Regression, Random Forest, Gradient Boosting, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Decision Tree. After developing and evaluating these individual models, we selected the two best-performing models and combined them using a stacking ensemble learning technique. Additionally, Recursive Feature Elimination (RFE) was used to select the most relevant features. Model performance was evaluated using metrics such as Area Under the Receiver Operating Characteristic Curve (AUC-ROC), accuracy, precision, recall, and F1 score, alongside calibration plots and Decision Curve Analysis (DCA).
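The modelling steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: it uses scikit-learn, a synthetic dataset in place of the eICU cohort, default hyperparameters, and an illustrative choice of nine features for RFE. The abstract does not say which library, settings, or meta-learner were used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort (NOT eICU data): roughly 3% positives.
X, y = make_classification(
    n_samples=4000, n_features=20, n_informative=6,
    weights=[0.97], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Recursive Feature Elimination: repeatedly drop the weakest feature
# until 9 remain (9 is an assumption made for this sketch).
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=9)

# Stacking: out-of-fold predictions from the two base models become
# the inputs of a logistic-regression meta-learner (an assumed choice).
stack = StackingClassifier(
    estimators=[
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

model = make_pipeline(selector, stack)
model.fit(X_train, y_train)

# Score the held-out split with AUC-ROC.
proba = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)
print(f"Hold-out AUC-ROC on synthetic data: {auc:.2f}")
```

The printed AUC describes only the synthetic data and should not be compared with the values reported in the Results.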
Results
The incidence rate of initial seizure episodes was 3.10% (497/15,890), with no significant difference between the training and validation sets. The best-performing individual models were Gradient Boosting (AUC-ROC: 0.78) and Logistic Regression (AUC-ROC: 0.79). The ensemble model achieved an AUC-ROC of 0.80 (95% CI: 0.78–0.82), accuracy of 0.78, precision of 0.80, recall of 0.75, and F1 score of 0.77. Calibration plots showed that the ensemble model’s predicted probabilities were well aligned with observed outcomes. DCA indicated a net benefit across a range of threshold probabilities, supporting the model’s clinical utility.
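Two of the quantities above follow from simple formulas, shown here as a short sketch. The F1 score is the harmonic mean of precision and recall, and the net benefit used in decision curve analysis is the standard definition, true-positive rate per patient minus false-positive rate per patient weighted by the odds of the threshold probability. The function names and the counts passed to `net_benefit` are hypothetical and are not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def net_benefit(tp: int, fp: int, n: int, threshold: float) -> float:
    """Decision-curve net benefit at a given threshold probability."""
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Reported precision (0.80) and recall (0.75) of the ensemble model.
print(round(f1_score(0.80, 0.75), 2))  # prints 0.77

# Hypothetical counts, for illustration only (not from the paper).
print(net_benefit(tp=30, fp=60, n=1000, threshold=0.10))
```

The first call reproduces the reported F1 score of 0.77 from the reported precision and recall.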
Conclusion
The ensemble learning model, combining Gradient Boosting and Logistic Regression via a stacking technique, demonstrated superior performance for identifying risk factors for initial seizure episodes in ICU patients. This model was evaluated using a range of performance metrics, including accuracy, sensitivity, specificity, and the AUC-ROC curve, and was validated through 10-fold cross-validation to ensure its robustness and generalizability. These results offer clinically relevant risk factor identification. Key risk factors identified include age, GCS score, glucose levels, hematocrit levels, hyponatremia, stroke history, prothrombin time, potassium levels, and hypertension. The risk estimation table simplifies these complex interactions into a practical tool for clinical use.
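As a rough illustration of the 10-fold cross-validation mentioned above, the sketch below scores a single classifier by AUC-ROC on each of ten folds. It makes several assumptions: scikit-learn is used, the data are synthetic, plain logistic regression stands in for the ensemble, and the folds are stratified. The abstract does not state whether the authors stratified their folds.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic, imbalanced stand-in data (NOT the eICU cohort).
X, y = make_classification(
    n_samples=3000, n_features=12, n_informative=5,
    weights=[0.95], random_state=0,
)

# Stratification keeps the rare positive class represented in every fold.
folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, y, cv=folds, scoring="roc_auc"
)
print(f"AUC-ROC over 10 folds: mean={scores.mean():.2f}, sd={scores.std():.2f}")
```

The spread across folds gives a sense of how stable the estimate is. Cross-validation of this kind measures internal validity only.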
About the journal:
The Journal of Clinical Neuroscience is an international journal that publishes articles on clinical neurosurgery and neurology and the related neurosciences, such as neuropathology, neuroradiology, neuro-ophthalmology and neurophysiology.
The journal has a broad international perspective and emphasises the advances occurring in Asia, the Pacific Rim region, Europe and North America. It acts as a focus for publication of major clinical and laboratory research, and also publishes solicited manuscripts on specific subjects from experts, case reports and other information of interest to clinicians working in the clinical neurosciences.