From Preliminary Urinalysis to Decision Support: Machine Learning for UTI Prediction in Real-World Laboratory Data

Impact factor: 3.0 · Zone 3 (Medicine) · JCR Q2 (Health Care Sciences & Services)

Journal of Personalized Medicine · Pub Date: 2025-05-16 · DOI: 10.3390/jpm15050200

Athanasia Sergounioti, Dimitrios Rigas, Vassilios Zoitopoulos, Dimitrios Kalles

Volume 15, Issue 5 · Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12113611/pdf/

Citations: 0

Abstract


From Preliminary Urinalysis to Decision Support: Machine Learning for UTI Prediction in Real-World Laboratory Data.

Background/Objectives: Urinary tract infections (UTIs) are frequently diagnosed empirically, often leading to overtreatment and rising antimicrobial resistance. This study aimed to develop and evaluate machine learning (ML) models that predict urine culture outcomes using routine urinalysis and demographic data, supporting more targeted empirical antibiotic use.

Methods: A real-world dataset comprising 8065 urinalysis records from a hospital laboratory was used to train five ensemble ML models, including random forest, XGBoost (eXtreme gradient boosting), extra trees, voting classifier, and stacking classifier. Models were developed using 10-fold stratified cross-validation and assessed via clinically relevant metrics including specificity, sensitivity, likelihood ratios, and diagnostic odds ratios (DORs). To enhance screening utility, threshold optimization was applied to the best-performing model (XGBoost) using the Youden index.

Results: XGBoost and random forest demonstrated the most balanced diagnostic profiles (AUROC: 0.819 and 0.791, respectively), with DORs exceeding 21. The voting and stacking classifiers achieved the highest specificity (>95%) and positive likelihood ratios (>10) but exhibited lower sensitivity. Feature importance analysis identified positive nitrites, white blood cell count, and specific gravity as key predictors. Threshold tuning of XGBoost improved sensitivity from 70.2% to 87.9% and reduced false negatives by 82%, with an associated NPV of 96.4%. The adjusted model reduced overtreatment by 56% compared to empirical prescribing.

Conclusions: ML models based on structured urinalysis and demographic data can support clinical decision-making for UTIs. While high-specificity models may reduce unnecessary antibiotic use, sensitivity trade-offs must be considered. Threshold-optimized XGBoost offers a clinically adaptable tool for empirical treatment decisions by improving sensitivity and reducing overtreatment, thus supporting the more personalized and judicious use of antibiotics.
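The abstract names the evaluation metrics and the Youden index but does not show how they are computed. The following is a minimal sketch, not the authors' implementation. It uses the standard definitions (sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), LR+ = sensitivity/(1 − specificity), LR− = (1 − sensitivity)/specificity, DOR = (TP·TN)/(FP·FN), Youden's J = sensitivity + specificity − 1) and picks the threshold that maximizes J by scanning the observed scores. The names `Metrics`, `evaluate`, and `youden_threshold` are hypothetical, and the 14 scored samples are invented toy values, not data from the study.

```python
import math
from dataclasses import dataclass


def _div(num, den):
    """Divide, returning inf (or nan for 0/0) instead of raising on a zero cell."""
    if den == 0:
        return math.inf if num > 0 else math.nan
    return num / den


@dataclass(frozen=True)
class Metrics:
    """Confusion-matrix counts plus the diagnostic metrics derived from them."""
    tp: int
    fp: int
    tn: int
    fn: int

    @property
    def sensitivity(self):
        return _div(self.tp, self.tp + self.fn)

    @property
    def specificity(self):
        return _div(self.tn, self.tn + self.fp)

    @property
    def npv(self):
        return _div(self.tn, self.tn + self.fn)

    @property
    def lr_pos(self):
        return _div(self.sensitivity, 1.0 - self.specificity)

    @property
    def lr_neg(self):
        return _div(1.0 - self.sensitivity, self.specificity)

    @property
    def dor(self):
        # Diagnostic odds ratio from the raw counts: (TP * TN) / (FP * FN).
        return _div(self.tp * self.tn, self.fp * self.fn)


def evaluate(y_true, scores, threshold):
    """Call a sample positive when its score is >= threshold and count outcomes."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        if score >= threshold:
            if label == 1:
                tp += 1
            else:
                fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return Metrics(tp, fp, tn, fn)


def youden_threshold(y_true, scores):
    """Return (threshold, J) maximizing Youden's J over the observed scores."""
    best_t, best_j = None, -math.inf
    for t in sorted(set(scores)):
        m = evaluate(y_true, scores, t)
        j = m.sensitivity + m.specificity - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Invented toy data: (predicted probability, culture result), NOT study data.
toy = [(0.03, 0), (0.07, 0), (0.12, 0), (0.16, 1), (0.21, 0), (0.26, 0),
       (0.31, 1), (0.37, 1), (0.44, 0), (0.52, 1), (0.58, 0), (0.66, 1),
       (0.79, 1), (0.91, 1)]
scores = [s for s, _ in toy]
y_true = [y for _, y in toy]

default = evaluate(y_true, scores, 0.5)   # conventional 0.5 cut-off (assumed)
best_t, best_j = youden_threshold(y_true, scores)
tuned = evaluate(y_true, scores, best_t)

print(f"default: sens={default.sensitivity:.3f} spec={default.specificity:.3f} FN={default.fn}")
print(f"tuned  : threshold={best_t} J={best_j:.3f} "
      f"sens={tuned.sensitivity:.3f} spec={tuned.specificity:.3f} FN={tuned.fn}")
```

On this toy set the J-maximizing threshold falls below 0.5, so sensitivity rises and false negatives drop at the cost of some specificity. That is the same direction of trade-off the abstract reports for the tuned XGBoost model. In practice the threshold would normally be chosen on held-out validation folds rather than on the data used for the final evaluation.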


Source journal

Journal of Personalized Medicine (Medicine: Medicine (miscellaneous))

CiteScore: 4.10
Self-citation rate: 0.00%
Articles published: 1878
Review time: 11 weeks

About the journal: Journal of Personalized Medicine (JPM; ISSN 2075-4426) is an international, open access journal aimed at bringing all aspects of personalized medicine to one platform. JPM publishes cutting-edge, innovative preclinical and translational scientific research and technologies related to personalized medicine (e.g., pharmacogenomics/proteomics, systems biology). JPM recognizes that personalized medicine (the assessment of genetic, environmental and host factors that cause variability of individuals) is a challenging, transdisciplinary topic that requires discussions from a range of experts. For a comprehensive perspective of personalized medicine, JPM aims to integrate expertise from the molecular and translational sciences, therapeutics and diagnostics, as well as discussions of regulatory, social, ethical and policy aspects. The journal provides a forum to bring together academic and clinical researchers, biotechnology, diagnostic and pharmaceutical companies, health professionals, regulatory and ethical experts, and government and regulatory authorities.