{"title":"与其他机器学习算法相比,在预测骨质疏松性髋部骨折院内死亡率方面,奈维贝叶斯是一种可解释、可预测的机器学习算法","authors":"Jo-Wai Douglas Wang","doi":"10.1101/2024.05.10.24307161","DOIUrl":null,"url":null,"abstract":"Osteoporotic hip fractures (HFs) in the elderly are a pertinent issue in healthcare, particularly in developed countries such as Australia. Estimating prognosis following admission remains a key challenge. Current predictive tools require numerous patient input features including those unavailable early in admission. Moreover, attempts to explain machine learning [ML]-based predictions are lacking. We developed 7 ML prognostication models to predict in-hospital mortality following minimal trauma HF in those aged ≥ 65 years of age, requiring only sociodemographic and comorbidity data as input. Hyperparameter tuning was performed via fractional factorial design of experiments combined with grid search; models were evaluated with 5-fold cross-validation and area under the receiver operating characteristic curve (AUROC). For explainability, ML models were directly interpreted as well as analyzed with SHAP values. Top performing models were random forests, naïve Bayes [NB], extreme gradient boosting, and logistic regression (AUROCs ranging 0.682 – 0.696, p>0.05). Interpretation of models found the most important features were chronic kidney disease, cardiovascular comorbidities and markers of bone metabolism; NB also offers direct intuitive interpretation. Overall, we conclude that NB has much potential as an algorithm, due to its simplicity and interpretability whilst maintaining competitive predictive performance.","PeriodicalId":501025,"journal":{"name":"medRxiv - Geriatric Medicine","volume":"159 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Naïve Bayes is an interpretable and predictive machine learning algorithm in predicting osteoporotic hip fracture in-hospital mortality compared to other machine learning algorithms\",\"authors\":\"Jo-Wai Douglas Wang\",\"doi\":\"10.1101/2024.05.10.24307161\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Osteoporotic hip fractures (HFs) in the elderly are a pertinent issue in healthcare, particularly in developed countries such as Australia. Estimating prognosis following admission remains a key challenge. Current predictive tools require numerous patient input features including those unavailable early in admission. Moreover, attempts to explain machine learning [ML]-based predictions are lacking. We developed 7 ML prognostication models to predict in-hospital mortality following minimal trauma HF in those aged ≥ 65 years of age, requiring only sociodemographic and comorbidity data as input. Hyperparameter tuning was performed via fractional factorial design of experiments combined with grid search; models were evaluated with 5-fold cross-validation and area under the receiver operating characteristic curve (AUROC). For explainability, ML models were directly interpreted as well as analyzed with SHAP values. Top performing models were random forests, naïve Bayes [NB], extreme gradient boosting, and logistic regression (AUROCs ranging 0.682 – 0.696, p>0.05). Interpretation of models found the most important features were chronic kidney disease, cardiovascular comorbidities and markers of bone metabolism; NB also offers direct intuitive interpretation. Overall, we conclude that NB has much potential as an algorithm, due to its simplicity and interpretability whilst maintaining competitive predictive performance.\",\"PeriodicalId\":501025,\"journal\":{\"name\":\"medRxiv - Geriatric Medicine\",\"volume\":\"159 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-05-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"medRxiv - Geriatric Medicine\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1101/2024.05.10.24307161\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"medRxiv - Geriatric Medicine","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1101/2024.05.10.24307161","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
摘要
老年人骨质疏松性髋部骨折(HFs)是医疗保健领域的一个相关问题,尤其是在澳大利亚等发达国家。估计入院后的预后仍是一项关键挑战。目前的预测工具需要大量的患者输入特征,包括入院早期无法获得的特征。此外,基于机器学习[ML]的预测也缺乏解释。我们开发了 7 个 ML 预测模型,用于预测年龄≥ 65 岁的轻微创伤高频患者的院内死亡率,只需输入社会人口学和合并症数据。超参数调整通过分数因子实验设计结合网格搜索进行;模型通过 5 倍交叉验证和接收者操作特征曲线下面积(AUROC)进行评估。在可解释性方面,直接解释了 ML 模型,并用 SHAP 值进行了分析。表现最好的模型是随机森林、天真贝叶斯[NB]、极梯度提升和逻辑回归(AUROC 在 0.682 - 0.696 之间,p>0.05)。对模型的解释发现,最重要的特征是慢性肾病、心血管并发症和骨代谢标志物;NB 也提供了直接直观的解释。总之,我们得出的结论是,NB 作为一种算法具有很大的潜力,因为它既简单又可解释,同时还能保持有竞争力的预测性能。
Naïve Bayes is an interpretable and predictive machine learning algorithm in predicting osteoporotic hip fracture in-hospital mortality compared to other machine learning algorithms
Osteoporotic hip fractures (HFs) in the elderly are a pertinent issue in healthcare, particularly in developed countries such as Australia. Estimating prognosis following admission remains a key challenge. Current predictive tools require numerous patient input features including those unavailable early in admission. Moreover, attempts to explain machine learning [ML]-based predictions are lacking. We developed 7 ML prognostication models to predict in-hospital mortality following minimal trauma HF in those aged ≥ 65 years of age, requiring only sociodemographic and comorbidity data as input. Hyperparameter tuning was performed via fractional factorial design of experiments combined with grid search; models were evaluated with 5-fold cross-validation and area under the receiver operating characteristic curve (AUROC). For explainability, ML models were directly interpreted as well as analyzed with SHAP values. Top performing models were random forests, naïve Bayes [NB], extreme gradient boosting, and logistic regression (AUROCs ranging 0.682 – 0.696, p>0.05). Interpretation of models found the most important features were chronic kidney disease, cardiovascular comorbidities and markers of bone metabolism; NB also offers direct intuitive interpretation. Overall, we conclude that NB has much potential as an algorithm, due to its simplicity and interpretability whilst maintaining competitive predictive performance.