Athanasia Sergounioti, Dimitrios Rigas, Vassilios Zoitopoulos, Dimitrios Kalles
Journal of Personalized Medicine, vol. 15, no. 5. Published 2025-05-16. DOI: 10.3390/jpm15050200 (https://doi.org/10.3390/jpm15050200). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12113611/pdf/
From Preliminary Urinalysis to Decision Support: Machine Learning for UTI Prediction in Real-World Laboratory Data.
Background/Objectives: Urinary tract infections (UTIs) are frequently diagnosed empirically, often leading to overtreatment and rising antimicrobial resistance. This study aimed to develop and evaluate machine learning (ML) models that predict urine culture outcomes using routine urinalysis and demographic data, supporting more targeted empirical antibiotic use.

Methods: A real-world dataset comprising 8065 urinalysis records from a hospital laboratory was used to train five ensemble ML models, including random forest, XGBoost (eXtreme gradient boosting), extra trees, voting classifier, and stacking classifier. Models were developed using 10-fold stratified cross-validation and assessed via clinically relevant metrics including specificity, sensitivity, likelihood ratios, and diagnostic odds ratios (DORs). To enhance screening utility, threshold optimization was applied to the best-performing model (XGBoost) using the Youden index.

Results: XGBoost and random forest demonstrated the most balanced diagnostic profiles (AUROC: 0.819 and 0.791, respectively), with DORs exceeding 21. The voting and stacking classifiers achieved the highest specificity (>95%) and positive likelihood ratios (>10) but exhibited lower sensitivity. Feature importance analysis identified positive nitrites, white blood cell count, and specific gravity as key predictors. Threshold tuning of XGBoost improved sensitivity from 70.2% to 87.9% and reduced false negatives by 82%, with an associated NPV of 96.4%. The adjusted model reduced overtreatment by 56% compared to empirical prescribing.

Conclusions: ML models based on structured urinalysis and demographic data can support clinical decision-making for UTIs. While high-specificity models may reduce unnecessary antibiotic use, sensitivity trade-offs must be considered. Threshold-optimized XGBoost offers a clinically adaptable tool for empirical treatment decisions by improving sensitivity and reducing overtreatment, thus supporting the more personalized and judicious use of antibiotics.
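The threshold optimization and diagnostic metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so everything below is an assumption for illustration only: the function names are hypothetical, the labels and scores are invented toy values rather than study data, and candidate thresholds are simply the observed scores. The Youden index is J = sensitivity + specificity - 1, and the sketch picks the threshold that maximizes it.

```python
import math
from dataclasses import dataclass


@dataclass
class DiagnosticMetrics:
    sensitivity: float
    specificity: float
    npv: float
    lr_positive: float
    lr_negative: float
    dor: float


def _ratio(numerator, denominator):
    """Divide, returning inf (or nan for 0/0) instead of raising on a zero denominator."""
    if denominator == 0:
        return math.inf if numerator > 0 else math.nan
    return numerator / denominator


def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, TN, FN when scores >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def diagnostic_metrics(tp, fp, tn, fn):
    """Compute the metrics listed in the abstract from confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    false_positive_rate = _ratio(fp, fp + tn)
    return DiagnosticMetrics(
        sensitivity=sensitivity,
        specificity=specificity,
        npv=_ratio(tn, tn + fn),
        lr_positive=_ratio(sensitivity, false_positive_rate),
        lr_negative=_ratio(1 - sensitivity, specificity),
        dor=_ratio(tp * tn, fp * fn),
    )


def youden_threshold(y_true, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    Assumes both classes are present; ties keep the lowest threshold.
    """
    best_threshold, best_j = None, -math.inf
    for threshold in sorted(set(scores)):
        tp, fp, tn, fn = confusion_counts(y_true, scores, threshold)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Toy data, invented for illustration (1 = positive culture, 0 = negative).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.25, 0.40, 0.70, 0.90]

default_metrics = diagnostic_metrics(*confusion_counts(y_true, scores, 0.5))
tuned_threshold, tuned_j = youden_threshold(y_true, scores)
tuned_metrics = diagnostic_metrics(*confusion_counts(y_true, scores, tuned_threshold))

print("default threshold 0.5:", default_metrics)
print("Youden threshold", tuned_threshold, ":", tuned_metrics)
```

On this toy data the Youden-selected threshold falls below the default 0.5, trading some specificity for higher sensitivity, which is the same direction of trade-off the abstract reports for the tuned XGBoost model. In practice the threshold would be chosen on held-out or cross-validated predictions rather than on the data used to report performance.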
About the journal:
Journal of Personalized Medicine (JPM; ISSN 2075-4426) is an international, open access journal aimed at bringing all aspects of personalized medicine to one platform. JPM publishes cutting edge, innovative preclinical and translational scientific research and technologies related to personalized medicine (e.g., pharmacogenomics/proteomics, systems biology). JPM recognizes that personalized medicine—the assessment of genetic, environmental and host factors that cause variability of individuals—is a challenging, transdisciplinary topic that requires discussions from a range of experts. For a comprehensive perspective of personalized medicine, JPM aims to integrate expertise from the molecular and translational sciences, therapeutics and diagnostics, as well as discussions of regulatory, social, ethical and policy aspects. We provide a forum to bring together academic and clinical researchers, biotechnology, diagnostic and pharmaceutical companies, health professionals, regulatory and ethical experts, and government and regulatory authorities.