Investigating perioperative pressure injuries and factors influencing them with imbalanced samples using a Synthetic Minority Over-sampling Technique.

IF 5.7 | Zone 4 (Biology) | JCR Q1 (Biology)
Bioscience trends Pub Date : 2025-05-09 Epub Date: 2025-04-15 DOI:10.5582/bst.2025.01013
Yiwei Zhou, Jian Wu, Xin Xu, Guirong Shi, Ping Liu, Liping Jiang
Bioscience Trends, vol. 19, no. 2, pp. 173-188. Journal Article.
Citations: 0

Abstract

This study investigates the use of machine learning (ML) models combined with the Synthetic Minority Over-sampling Technique (SMOTE) and its variants to predict perioperative pressure injuries (PIs) in an imbalanced dataset. PIs are a significant healthcare problem, often leading to prolonged hospitalization and increased medical costs. Conventional risk assessment scales are limited in their ability to predict PIs accurately, prompting the exploration of ML techniques to address this challenge.

We utilized data from 7,292 patients admitted to a tertiary care hospital in Shanghai between May 2017 and July 2023, with a final dataset of 2,972 patients, including 158 with PIs. Seven ML algorithms were used: Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Extra Trees (ET), K-Nearest Neighbors (KNN), and Decision Trees (DT). These were used in conjunction with SMOTE, SMOTE+ENN, Borderline-SMOTE, ADASYN, and GAN to balance the dataset and improve model performance.

Results revealed significant improvements in model performance when SMOTE and its variants were used. For instance, the XGBoost model had an AUC of 0.996 with SMOTE, compared to 0.800 on raw data. SMOTE+ENN and Borderline-SMOTE further enhanced the models' ability to identify the minority class. External validation indicated that XGBoost, RF, and ET exhibited the highest stability and accuracy, with XGBoost having an AUC of 0.977. SHAP analysis revealed that factors such as anesthesia grade, age, and serum albumin levels significantly influenced model predictions.

In conclusion, integrating SMOTE with ML algorithms effectively addressed the data imbalance and improved the prediction of perioperative PIs. Future work should focus on refining SMOTE techniques and exploring their application to larger, multi-center datasets to enhance the generalizability of these findings, especially for diseases with a low incidence.
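To make the oversampling idea in the abstract concrete, here is a minimal pure-Python sketch of the basic SMOTE interpolation step: each synthetic point is placed on the line segment between a minority sample and one of its k nearest minority neighbours. This is an illustration only, not the authors' implementation; the function name `smote`, the choice `k=5`, the seeds, and the two-dimensional toy points are all assumptions introduced here. Only the class counts (158 PI cases among 2,972 patients) come from the abstract.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority-class neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    # For every minority sample, find its k nearest minority neighbours
    # by Euclidean distance (brute force, fine for small classes).
    neighbours = []
    for i, x in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))   # pick a minority sample
        j = rng.choice(neighbours[i])      # pick one of its neighbours
        gap = rng.random()                 # interpolation weight in [0, 1)
        x, nn = minority[i], minority[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nn)])
    return synthetic


# Class counts reported in the abstract.
n_total, n_minority = 2972, 158
n_majority = n_total - n_minority
n_needed = n_majority - n_minority  # synthetic samples needed for a 1:1 ratio

# Made-up 2-D points standing in for minority-class features (NOT patient data).
toy_rng = random.Random(42)
toy_minority = [[toy_rng.uniform(0, 1), toy_rng.uniform(0, 1)]
                for _ in range(n_minority)]

new_points = smote(toy_minority, n_needed)
print(n_majority, n_needed, len(new_points))  # prints: 2814 2656 2656
```

With the reported counts, the minority class is about 5.3% of the data, so a fully balanced set would need 2,656 synthetic PI samples. In common practice, oversampling of this kind is applied only to the training split so that synthetic points do not leak into evaluation data; whether the study did so is not stated in the abstract.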

Source journal
CiteScore: 13.60
Self-citation rate: 1.80%
Articles published: 47
Review time: >12 weeks
About the journal: BioScience Trends (Print ISSN 1881-7815, Online ISSN 1881-7823) is an international peer-reviewed journal devoted to publishing the latest and most exciting advances in scientific research. Articles cover fields of life science such as biochemistry, molecular biology, clinical research, public health, the medical care system, and social science, in order to encourage cooperation and exchange among scientists and clinical researchers.