Authors: Jiawen Deng, Kiyan Heybati, Hemang Yadav
Journal: medRxiv : the preprint server for health sciences
Publication date: 2025-01-07
DOI: https://doi.org/10.1101/2024.08.17.24312159
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11370510/pdf/
Protocol for the development and validation of machine-learning models for predicting the risk of hypertriglyceridemia in critically ill patients receiving propofol sedation using retrospective data.
Introduction: Propofol is a widely used sedative-hypnotic agent for critically ill patients requiring invasive mechanical ventilation (IMV). Despite its clinical benefits, propofol is associated with an increased risk of hypertriglyceridemia. Early identification of patients at risk for propofol-associated hypertriglyceridemia is crucial for optimizing sedation strategies and preventing adverse outcomes. Machine learning (ML) models offer a promising approach for predicting each patient's individual risk of propofol-associated hypertriglyceridemia.
Methods and analysis: We propose the development of an ML model for predicting the risk of propofol-associated hypertriglyceridemia in ICU patients receiving IMV. The study will use retrospective data from four Mayo Clinic sites. Nested cross-validation (CV) will be employed, with a 10-fold inner CV loop for model tuning and selection and an outer leave-one-site-out CV loop for external validation. Feature selection will be conducted using Boruta and LASSO-penalized logistic regression. Data preprocessing steps include missing data imputation, feature scaling, and dimensionality reduction. Six ML algorithms will be tuned and evaluated. Bayesian optimization will be used for hyperparameter selection. Global model explainability will be assessed using permutation importance, and local model explainability will be assessed using SHapley Additive exPlanations (SHAP).
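The nested validation layout described above can be made concrete with a small sketch. The code below only generates the index splits: an outer loop that holds out one site at a time and an inner 10-fold split over the remaining sites. It is an illustration in plain Python, not the authors' pipeline. The site labels, cohort size, helper names (`leave_one_site_out`, `k_fold`), and the use of an unstratified shuffled inner split are all assumptions made for the example.

```python
import random
from collections import defaultdict


def leave_one_site_out(sites):
    """Yield (held_out_site, train_idx, test_idx) for each distinct site."""
    by_site = defaultdict(list)
    for idx, site in enumerate(sites):
        by_site[site].append(idx)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        train_idx = [i for s in sorted(by_site) if s != held_out for i in by_site[s]]
        yield held_out, train_idx, test_idx


def k_fold(indices, k=10, seed=0):
    """Yield (inner_train, inner_val) index lists for a shuffled k-fold split."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i, val in enumerate(folds):
        train = [j for f_idx, fold in enumerate(folds) if f_idx != i for j in fold]
        yield train, val


# Hypothetical toy cohort: 4 sites with 50 patients each (assumed sizes).
sites = [s for s in ("A", "B", "C", "D") for _ in range(50)]

summary = []
for held_out, outer_train, outer_test in leave_one_site_out(sites):
    inner_folds = list(k_fold(outer_train, k=10))
    # Tuning and model selection would use only inner_folds; the held-out
    # site is touched once, to score the model chosen by the inner loop.
    leaked = any(sites[i] == held_out for tr, va in inner_folds for i in tr + va)
    summary.append((held_out, len(outer_train), len(outer_test), len(inner_folds), leaked))

for row in summary:
    print(row)
```

The `leaked` flag checks the property that makes the outer loop an external validation: no patient from the held-out site appears anywhere in the inner folds used for tuning. In a real pipeline, imputation, scaling, and feature selection would also need to be fitted inside each inner training fold to preserve that separation.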
Ethics and dissemination: The proposed ML model aims to provide a reliable and interpretable tool for clinicians to predict the risk of propofol-associated hypertriglyceridemia in ICU patients. The final model will be deployed in a web-based clinical risk calculator. The model development process and performance measures obtained during nested cross-validation will be described in a study publication to be disseminated in a peer-reviewed journal. The proposed study has received ethics approval from the Mayo Clinic Institutional Review Board (IRB #23-007416).
Strengths and limitations of this study: Robust external validation using a nested cross-validation (CV) framework will help assess the generalizability of models produced from the modeling pipeline across different hospital settings. A diverse set of machine learning (ML) algorithms and advanced hyperparameter tuning techniques will be employed to identify the optimal model configuration. Integration of feature explainability will enhance the clinical applicability of the ML models by providing transparency in predictions, which can improve clinician trust and encourage adoption. Reliance on retrospective data may introduce biases due to inconsistent or erroneous data collection, and the computational intensity of the validation approach may limit replication and future model expansion in resource-constrained settings.
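Permutation importance, the global explainability method named in the methods, can be illustrated with a minimal sketch: shuffle one feature column at a time and record how much the model's score drops. The example below is plain Python on invented toy data. The `toy_model` stand-in, the accuracy metric, and the repeat count are assumptions for illustration and are not taken from the protocol.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled_col = [row[col] for row in X]
            rng.shuffle(shuffled_col)
            # Rebuild the data with only this one column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled_col)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Hypothetical toy data: feature 0 drives the label, feature 1 is pure noise.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 10), data_rng.uniform(0, 10)] for _ in range(200)]
y = [int(row[0] > 5) for row in X]


def toy_model(row):
    # Stand-in for a fitted classifier; it only looks at feature 0.
    return int(row[0] > 5)


baseline, importances = permutation_importance(toy_model, X, y)
print(baseline, importances)
```

Because the stand-in model ignores feature 1, shuffling that column leaves accuracy unchanged and its importance is zero, while shuffling feature 0 produces a large drop. In practice the scores would be computed on held-out data, so that the importances reflect what the model relies on when generalizing rather than what it memorized.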