Establishment of a Machine Learning-Based Predictive Model for Klebsiella pneumoniae Liver Abscess
Authors: Haoran Li, Yan Yu, Xi Chen, Qingqing Sun, Xiumin Li, Qiujing Shang, Minghua Ying, Xiulin Liu, Jing Meng, Lele Bian, Shanshan Wu, Yuejuan Gao
Journal: Infection and Drug Resistance, volume 18, pages 5097-5108 (Journal Article, eCollection)
Impact factor: 2.9; JCR: Q2 (Infectious Diseases); region category: Medicine (3)
Published: 2025-09-24
DOI: https://doi.org/10.2147/IDR.S545440
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12477283/pdf/
Source platform: Semanticscholar
Citations: 0
Abstract
Establishment of a Machine Learning-Based Predictive Model for Klebsiella pneumoniae Liver Abscess.
Purpose: To investigate the clinical and ultrasonographic characteristics of pyogenic liver abscess (PLA) caused by Klebsiella pneumoniae (K-PLA) and non-Klebsiella pneumoniae pathogens, and to develop machine learning models for the differential diagnosis of K-PLA.
Materials and methods: This retrospective study enrolled patients who were clinically diagnosed with PLA, with the diagnosis confirmed by ultrasound-guided puncture, at the Fifth Medical Center of PLA General Hospital between April 2013 and December 2020. Patients were categorized into K-PLA and non-K-PLA groups according to the causative pathogen. Baseline data were collected, including ultrasonographic features, clinical characteristics, and laboratory findings. The Boruta algorithm was used for feature selection, and four machine learning models were developed to diagnose K-PLA: a fully connected deep learning neural network (deeplearning), a Distributed Random Forest (drf), a Gradient Boosting Machine (gbm), and a Generalized Linear Model (glm). The models were validated using 5-fold cross-validation.
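The 5-fold cross-validation step can be illustrated with a minimal sketch. The abstract does not say how the folds were drawn, so the stratified split below (which keeps the K-PLA to non-K-PLA ratio similar in every fold) is an assumption, and the function name `stratified_kfold` and the seed are hypothetical. Only the class counts (134 and 67) come from the abstract.

```python
import random
from collections import Counter


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs with similar class proportions per fold."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's shuffled indices round-robin across the k folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    for i in range(k):
        val_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, val_idx


# Labels mirroring the reported cohort: 134 K-PLA (1) and 67 non-K-PLA (0).
labels = [1] * 134 + [0] * 67
splits = list(stratified_kfold(labels, k=5, seed=0))
for fold_no, (train_idx, val_idx) in enumerate(splits, start=1):
    counts = Counter(labels[i] for i in val_idx)
    print(
        f"fold {fold_no}: train={len(train_idx)} val={len(val_idx)} "
        f"K-PLA={counts[1]} non-K-PLA={counts[0]}"
    )
```

Each patient appears in exactly one validation fold, so every model is scored once on data it was not fitted to in that round.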
Results: A total of 201 patients with bacterial liver abscess were included (median age: 57 years; range: 49-66; 136 males), comprising 134 K-PLA cases and 67 non-K-PLA cases. The Boruta algorithm identified seven significant predictive variables: history of diabetes, history of hepatocellular carcinoma, history of biliary tract disease, history of infectious diseases, duration of fever, body temperature, and alanine aminotransferase (ALT) levels. Using these variables, the four machine learning models were constructed. In the training set, the area under the receiver operating characteristic curve (AUC) for predicting K-PLA was 0.716 (deeplearning), 0.999 (drf), 0.922 (gbm), and 0.718 (glm). In the validation set, the corresponding AUC values were 0.799, 0.763, 0.848, and 0.805, respectively.
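As a reminder of what the AUC figures measure, the sketch below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy scores are hypothetical and are not data from the study.

```python
def roc_auc(y_true, scores):
    """AUC as P(score of a random positive > score of a random negative)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy scores (1 = K-PLA, 0 = non-K-PLA), not study data.
y_true = [1, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.2]
toy_auc = roc_auc(y_true, scores)
print(round(toy_auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two groups.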
Conclusion: This study established four machine learning models for predicting the risk of K-PLA, with the gbm-based model achieving the highest validation AUC (0.848). These models may facilitate early clinical diagnosis and treatment of K-PLA, thereby reducing antibiotic misuse.
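The comparison behind this conclusion can be reproduced from the reported numbers alone. The sketch below ranks the four models by validation AUC and prints the gap between training and validation AUC. The AUC values are those given in the abstract; the variable names are illustrative. A large positive gap is commonly read as a sign of overfitting, although the abstract itself does not comment on this.

```python
# AUC values as reported in the abstract: (training, validation).
reported_auc = {
    "deeplearning": (0.716, 0.799),
    "drf": (0.999, 0.763),
    "gbm": (0.922, 0.848),
    "glm": (0.718, 0.805),
}

# Rank models by validation AUC, highest first.
ranked = sorted(reported_auc.items(), key=lambda item: item[1][1], reverse=True)
for name, (train_auc, val_auc) in ranked:
    gap = train_auc - val_auc
    print(f"{name:<13} train={train_auc:.3f} val={val_auc:.3f} gap={gap:+.3f}")

best_model = ranked[0][0]
print("highest validation AUC:", best_model)
```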
About the journal:
ISSN: 1178-6973
Editor-in-Chief: Professor Suresh Antony
An international, peer-reviewed, open access journal that focuses on the optimal treatment of infection (bacterial, fungal and viral) and the development and institution of preventative strategies to minimize the development and spread of resistance.