Optimizing Vital Signs in Patients With Traumatic Brain Injury: Reinforcement Learning Algorithm Development and Validation.

Impact factor 5.8 · Zone 2 (Medicine) · JCR Q1, Health Care Sciences & Services

Journal of Medical Internet Research Pub Date : 2025-07-03 DOI:10.2196/63847

Hongwei Zhang, Mengyuan Diao, Sheng Zhang, Peifeng Ni, Weidong Zhang, Chenxi Wu, Ying Zhu, Wei Hu

Journal of Medical Internet Research, volume 27, e63847. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12244269/pdf/


Abstract

Background: Traumatic brain injury (TBI) is a critical condition with a high mortality rate, and clinical practice continually seeks to optimize treatment strategies to improve survival.

Objective: This study aims to develop a reinforcement learning (RL) algorithm to optimize survival-oriented decision-making for patients with TBI in the intensive care unit.

Methods: We included 2745 patients from the Medical Information Mart for Intensive Care (MIMIC)-IV database and randomly divided them into a training set and an internal validation set in an 8:2 ratio. We extracted 34 features for analysis and modeling using a 2-hour time compensation, 2 action features (mean arterial pressure and temperature), and 1 outcome feature (survival status at 28 days). We used an RL algorithm called weighted dueling double deep Q-network with embedded human expertise to maximize cumulative returns and evaluated the model using a doubly robust off-policy evaluation method. Finally, we collected 2463 patients with TBI from MIMIC-III as an external validation set to test the model.
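As a rough illustration of the two standard ingredients named in "dueling double deep Q-network", the sketch below shows the dueling aggregation of a state value and per-action advantages, and the double-DQN bootstrap target. It is a minimal NumPy sketch, not the authors' implementation: the function names, array shapes, and the discount factor are assumptions, and the paper's weighting scheme and embedded human expertise are not reproduced here.

```python
import numpy as np

def dueling_q(value, advantage):
    """Combine V(s) and A(s, a) into Q(s, a) (dueling aggregation).

    value: shape (batch,), advantage: shape (batch, n_actions).
    Subtracting the mean advantage keeps V and A identifiable.
    """
    return value[:, None] + advantage - advantage.mean(axis=1, keepdims=True)

def double_dqn_target(reward, done, q_next_online, q_next_target, gamma=0.99):
    """Double-DQN target: the online network selects the next action,
    the target network evaluates it. gamma is an assumed discount factor.
    """
    best = q_next_online.argmax(axis=1)                      # action selection
    next_value = q_next_target[np.arange(len(best)), best]   # action evaluation
    return reward + gamma * (1.0 - done) * next_value

# Toy usage with hypothetical numbers (2 transitions, 2 actions).
q = dueling_q(np.array([1.0]), np.array([[1.0, 3.0]]))
targets = double_dqn_target(
    reward=np.array([1.0, 0.5]),
    done=np.array([0.0, 1.0]),
    q_next_online=np.array([[1.0, 2.0], [3.0, 0.0]]),
    q_next_target=np.array([[10.0, 20.0], [30.0, 40.0]]),
    gamma=0.9,
)
```

Decoupling action selection from action evaluation is the usual motivation for the "double" variant, since it reduces the overestimation that a single max over noisy Q-values tends to produce.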

Results: The action features were divided into 6 intervals, and the expected returns were estimated using a doubly robust off-policy evaluation method. The survival rate under the artificial intelligence (AI) strategy was higher than that under clinicians' decisions (88.016%, 95% CI 85.191%-90.840% vs 81.094%, 95% CI 80.422%-81.765%), with an expected return of 28.978 (95% CI 28.797-29.160) versus 27.092 (95% CI 24.584-29.600). Compared with clinicians, the AI algorithm selected normal temperatures (36.56 °C to 36.83 °C) more frequently and recommended mean arterial pressure levels of 87.5-95.0 mm Hg. In external validation, the AI strategy retained a high survival rate of 87.565%, with an expected return of 27.517.
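The abstract does not say which doubly robust estimator was used. As a hedged illustration only, the sketch below implements the common step-wise recursive form for a single trajectory, which combines a model-based value estimate with an importance-weighted correction. The function name, argument layout, and discount factor are assumptions made for this example.

```python
def doubly_robust_value(rewards, rho, q_hat, v_hat, gamma=0.99):
    """Step-wise doubly robust estimate for one trajectory.

    rewards[t]: observed reward at step t
    rho[t]:     pi_eval(a_t | s_t) / pi_behavior(a_t | s_t)
    q_hat[t]:   model estimate of Q(s_t, a_t) under the evaluated policy
    v_hat[t]:   model estimate of V(s_t) under the evaluated policy
    """
    v_dr = 0.0
    # Work backward from the final step of the trajectory.
    for r, w, q, v in zip(reversed(rewards), reversed(rho),
                          reversed(q_hat), reversed(v_hat)):
        v_dr = v + w * (r + gamma * v_dr - q)
    return v_dr

# Toy usage with hypothetical numbers.
on_policy = doubly_robust_value([1.0, 2.0], [1.0, 1.0], [5.0, 7.0], [5.0, 7.0], gamma=0.5)
no_overlap = doubly_robust_value([1.0, 2.0], [0.0, 0.0], [5.0, 7.0], [5.0, 7.0], gamma=0.5)
```

In the first call the importance ratios are all 1 and the model terms cancel, so the estimate reduces to the observed discounted return. In the second the ratios are all 0, so the estimate falls back to the model's value at the first state. In practice the per-trajectory estimates would be averaged over the validation set.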

Conclusions: This RL algorithm for patients with TBI indicates that more personalized and targeted optimization of vital signs is possible. This algorithm will assist clinicians in making individualized, patient-by-patient decisions.


Source journal

Journal of Medical Internet Research (Medicine: Health Care)

CiteScore: 14.40

Self-citation rate: 5.40%

Articles published: 654

Review time: 1 month

Journal description: The Journal of Medical Internet Research (JMIR) is a highly respected publication in health informatics and health services. Founded in 1999, JMIR has been a pioneer in the field for over two decades. The journal focuses on digital health, data science, health informatics, and emerging technologies for health, medicine, and biomedical research. It is recognized as a top publication in these disciplines, ranking in the first quartile (Q1) by Impact Factor, and is ranked #1 on Google Scholar within the "Medical Informatics" discipline.