Advancing shock prediction: leveraging prior knowledge and self-controlled data for enhanced model accuracy and generalizability
Authors: Cheng-Yu Tsai, Xiu-Rong Huang, Po-Tsun Kuo, Tzu-Tao Chen, Yun-Kai Yeh, Kuan-Yuan Chen, Arnab Majumdar, Chien-Hua Tseng
Journal: BMC Medical Informatics and Decision Making, 25(1), 262. Published 2025-07-14. Journal Article.
DOI: 10.1186/s12911-025-03108-2 (https://doi.org/10.1186/s12911-025-03108-2)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12261771/pdf/
Impact factor: 3.3 (JCR Q2, Medical Informatics)
Clinical trial number: Not applicable.
Citations: 0
Abstract
Objectives: Timely intervention in shock is vital, as delays over one hour greatly increase mortality. This study aims to develop an enhanced machine learning model that improves predictive performance by utilizing self-controlled data and applying feature engineering informed by medical knowledge to physiological waveforms, enabling the prediction of shock one hour in advance without relying on blood tests.
Methods: Patient data and physiological waveforms were obtained from the Medical Information Mart for Intensive Care III (MIMIC-3) database. Shock was defined as a mean arterial pressure ≤ 65 mmHg for more than one minute, combined with a serum lactate level ≥ 2 mmol/L within 12 h before or after the hypotension event. Waveforms used for prediction were extracted from the 30-min segment preceding the 1-hour period before the event. Self-controlled waveforms were obtained from the same patient either one day before or up to seven days after the shock event.
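The labeling rule and window timing described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 1 Hz MAP sampling rate, and the representation of lactate results as (time, value) pairs are all assumptions made for the example. Only the thresholds (65 mmHg, one minute, 2 mmol/L, 12 h, 1-hour lead, 30-min segment) come from the abstract.

```python
from typing import List, Optional, Sequence, Tuple

# Thresholds taken from the abstract.
MAP_THRESHOLD_MMHG = 65.0       # hypotension: MAP <= 65 mmHg
MIN_DURATION_S = 60.0           # sustained for more than one minute
LACTATE_THRESHOLD = 2.0         # serum lactate >= 2 mmol/L
LACTATE_WINDOW_S = 12 * 3600    # within 12 h before or after the event
LEAD_TIME_S = 3600              # predict one hour in advance
SEGMENT_S = 30 * 60             # 30-min input segment


def hypotension_onset(map_mmhg: Sequence[float], fs_hz: float = 1.0) -> Optional[float]:
    """Return the start time (s) of the first run of MAP <= 65 mmHg lasting
    more than one minute, or None. The sampling rate is an assumption."""
    run_start = None
    for i, value in enumerate(map_mmhg):
        if value <= MAP_THRESHOLD_MMHG:
            if run_start is None:
                run_start = i
            # Run length in seconds, counting the current sample.
            if (i - run_start + 1) / fs_hz > MIN_DURATION_S:
                return run_start / fs_hz
        else:
            run_start = None
    return None


def meets_shock_definition(onset_s: float, lactates: List[Tuple[float, float]]) -> bool:
    """True if any lactate result (time_s, mmol/L) within +/- 12 h of the
    hypotension onset is >= 2 mmol/L."""
    return any(
        abs(t - onset_s) <= LACTATE_WINDOW_S and value >= LACTATE_THRESHOLD
        for t, value in lactates
    )


def prediction_segment(onset_s: float) -> Tuple[float, float]:
    """Return (start_s, end_s) of the 30-min segment that ends one hour
    before the event."""
    end_s = onset_s - LEAD_TIME_S
    return end_s - SEGMENT_S, end_s


if __name__ == "__main__":
    # Synthetic trace: 2 h of normal MAP, then 2 min of hypotension.
    trace = [80.0] * 7200 + [60.0] * 120 + [80.0] * 10
    onset = hypotension_onset(trace)
    if onset is not None and meets_shock_definition(onset, [(onset + 3600, 2.4)]):
        print("input segment (s):", prediction_segment(onset))
```

Treating the segment as ending exactly one hour before onset is one reading of the abstract's wording; the full paper may specify the alignment differently.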
Results: The study included 389 ICU patients who met the shock criteria and had complete physiological waveform data available for analysis. A total of 299 features were derived: 90 from arterial blood pressure (ABP), 89 from electrocardiogram (ECG), 112 from respiratory waveforms (RESP), and 8 from blood oxygen saturation (SpO2). The weighted ensemble model performed best, with an AUC of 0.93, an accuracy of 84.15%, and a sensitivity of 79.64% in the testing set. The most predictive features included ECG_HRV_pNN50 (proportion of successive heartbeat intervals differing by more than 50 ms), RESP_Width_Mean (mean width of respiratory waveform), RESP_Cycle_Rate_Mean (mean respiratory cycle rate), ABP_TimeSBP2DBP_SampEn (sample entropy of systolic-diastolic intervals), and ABP_AmplitudeDBP_Median (median amplitude of diastolic peaks).
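To make the top-ranked feature concrete, pNN50 can be computed from a list of successive heartbeat (RR) intervals as in the sketch below. The function name and the choice to return a fraction rather than a percentage are assumptions for illustration; the abstract does not state which toolkit or scaling the authors used.

```python
from typing import Sequence


def pnn50(rr_ms: Sequence[float]) -> float:
    """Fraction of successive RR-interval differences larger than 50 ms.

    rr_ms: consecutive heartbeat intervals in milliseconds (at least two).
    """
    # Absolute difference between each interval and the next one.
    diffs = [abs(b - a) for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return sum(d > 50.0 for d in diffs) / len(diffs)


if __name__ == "__main__":
    # Successive differences are 60, 10, 60, 10 ms: two of four exceed 50 ms.
    print(pnn50([800, 860, 850, 790, 800]))
```

Multiplying the result by 100 gives the percentage form that some heart rate variability tools report.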
Conclusions: This study demonstrated the feasibility of predicting shock one hour before its onset using only four physiological waveforms, combined with feature engineering based on physiological concepts and self-controlled data. The model achieved an AUC of 0.93 and a sensitivity of 79.64% in the testing set.
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.