Optimizing Vital Signs in Patients With Traumatic Brain Injury: Reinforcement Learning Algorithm Development and Validation
Authors: Hongwei Zhang, Mengyuan Diao, Sheng Zhang, Peifeng Ni, Weidong Zhang, Chenxi Wu, Ying Zhu, Wei Hu
Journal: Journal of Medical Internet Research, volume 27, e63847 (Journal Article, published 2025-07-03)
DOI: 10.2196/63847 (https://doi.org/10.2196/63847)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12244269/pdf/
Impact factor: 5.8; JCR: Q1 (Health Care Sciences & Services)
Citations: 0
Abstract
Background: Traumatic brain injury (TBI) is a critical illness with a high mortality rate, and clinical care continually seeks to optimize treatment strategies to improve survival.
Objective: This study aims to develop a reinforcement learning (RL) algorithm that optimizes treatment decision-making to improve survival prognosis for patients with TBI in the intensive care unit.
Methods: We included 2745 patients from the Medical Information Mart for Intensive Care (MIMIC)-IV database and randomly divided them into a training set and an internal validation set at a ratio of 8:2. We extracted 34 features for analysis and modeling using a 2-hour time compensation, 2 action features (mean arterial pressure and temperature), and 1 outcome feature (survival status at 28 days). We used an RL algorithm, a weighted dueling double deep Q-network with embedded human expertise, to maximize cumulative return, and evaluated the model with a doubly robust off-policy evaluation method. Finally, we collected 2463 patients with TBI from MIMIC-III as an external validation set to test the model.
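The two ideas named in "dueling double deep Q-network" can be illustrated with a minimal sketch. It shows only the generic dueling aggregation and the generic double DQN target. It does not reproduce the weighting scheme or the embedded human expertise mentioned above, which the abstract does not specify. The function names and the toy numbers are hypothetical.

```python
from typing import List


def dueling_q(value: float, advantages: List[float]) -> List[float]:
    """Combine a state value V(s) and per-action advantages A(s, a) into Q(s, a).

    Subtracting the mean advantage is the usual way to keep V and A identifiable.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(
    reward: float,
    done: bool,
    q_next_online: List[float],
    q_next_target: List[float],
    gamma: float,
) -> float:
    """Double DQN target for one transition.

    The online network selects the next action; the target network evaluates it.
    """
    if done:
        return reward
    best_action = max(range(len(q_next_online)), key=q_next_online.__getitem__)
    return reward + gamma * q_next_target[best_action]


# Toy values, for illustration only (not taken from the study).
q_values = dueling_q(1.0, [0.0, 1.0, 2.0])
target = double_dqn_target(0.0, False, [0.1, 0.9, 0.2], [1.0, 2.0, 3.0], gamma=0.5)
```

Separating action selection from action evaluation is what distinguishes the double DQN target from the standard one, where a single network does both and tends to overestimate values.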
Results: The action features were divided into 6 intervals, and expected returns were estimated using a doubly robust off-policy evaluation method. The estimated survival rate under the artificial intelligence (AI) strategy was higher than that under the clinicians' strategy (88.016%, 95% CI 85.191%-90.840% vs 81.094%, 95% CI 80.422%-81.765%), with an expected return of 28.978 (95% CI 28.797-29.160) versus 27.092 (95% CI 24.584-29.600). Compared with clinicians, the AI algorithm selected normal temperatures (36.56 °C to 36.83 °C) more frequently and recommended mean arterial pressure levels of 87.5-95.0 mm Hg. In external validation, the AI strategy still had a high survival rate of 87.565%, with an expected return of 27.517.
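For readers unfamiliar with doubly robust off-policy evaluation, the following is a minimal sketch of one common form, the step-wise backward recursion. The abstract does not state which variant the authors used, so this form is an assumption. The `Step` fields, the function names, and the default discount factor are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    rho: float     # importance ratio pi_eval(a|s) / pi_behavior(a|s) for the logged action
    reward: float  # observed reward at this step
    q_hat: float   # model estimate of Q(s, a) for the logged action
    v_hat: float   # model estimate of V(s) under the evaluated policy


def doubly_robust_value(trajectory: List[Step], gamma: float = 0.99) -> float:
    """Step-wise doubly robust estimate of the evaluated policy's return.

    Works backward through one logged trajectory:
        V_dr = v_hat + rho * (reward + gamma * V_dr_next - q_hat)
    """
    value = 0.0
    for step in reversed(trajectory):
        value = step.v_hat + step.rho * (step.reward + gamma * value - step.q_hat)
    return value


def mean_doubly_robust_value(trajectories: List[List[Step]], gamma: float = 0.99) -> float:
    """Average the per-trajectory estimates over a logged data set."""
    estimates = [doubly_robust_value(t, gamma) for t in trajectories]
    return sum(estimates) / len(estimates)
```

The estimator combines a model-based baseline (`v_hat`, `q_hat`) with an importance-weighted correction. When the model is exact the correction term vanishes, and when the model is zero everywhere the recursion reduces to importance-weighted returns.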
Conclusions: This RL algorithm for patients with TBI indicates that more personalized and targeted optimization of vital signs is possible. The algorithm is intended to assist clinicians in making decisions on a patient-by-patient basis.
About the journal:
The Journal of Medical Internet Research (JMIR) is a journal in the field of health informatics and health services. Founded in 1999, it has been published for over two decades.
The journal focuses on digital health, data science, health informatics, and emerging technologies for health, medicine, and biomedical research. It ranks in the first quartile (Q1) by Impact Factor in these disciplines.
JMIR is ranked #1 on Google Scholar within the "Medical Informatics" discipline.