Analyte Importance Analysis in Machine Learning-Based Detection of Wrong-Blood-in-Tube Errors Using Complete Blood Count Data.

IF 3.0 | Zone 3 (Medicine) | JCR Q2, Health Care Sciences & Services

Journal of Personalized Medicine | Pub Date: 2025-09-01 | DOI: 10.3390/jpm15090404

Barış Gün Sürmeli, René Staritzbichler, Clemens Ringel, Saleem Al-Dakkak, Helene Dörksen, Thorsten Kaiser


Citations: 0

Abstract


Analyte Importance Analysis in Machine Learning-Based Detection of Wrong-Blood-in-Tube Errors Using Complete Blood Count Data.


Background: Wrong blood in tube (WBIT) is a critical pre-analytical error in laboratory medicine in which a blood sample is mislabeled with the wrong patient identity. These errors are often undetected due to the limitations of current detection strategies (e.g., delta checks). Methods: We evaluated Random Forest models for WBIT detection and conducted a detailed analyte importance analysis. In total, 799,721 samples from a German tertiary care center were analyzed and filtered for applicability. Model input features were derived by pairing consecutive same-patient samples for non-WBIT cases, simulating WBIT by pairing samples from different patients, and computing per-analyte first-order differences for each pair. We exhaustively searched all subsets of nine CBC analytes and evaluated models using F1 score, AUC, sensitivity, and PPV. Analyte importance was assessed via SHAP, permutation, and impurity decrease. Results: Models using as few as three analytes (MCV, RDW, MCH) reached F1 scores above 90%, with performance plateauing beyond six analytes. MCV and RDW were consistently top-ranked. Two-dimensional and three-dimensional visualizations revealed interpretable decision boundaries. Conclusions: Findings demonstrate that robust WBIT detection is achievable using a minimal subset of CBC analytes, offering a practical, interpretable, and broadly generalizable ML-based solution suitable for diverse clinical environments.
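The feature construction and subset search described in the Methods can be sketched roughly as follows. This is an illustrative outline using only the Python standard library, not the authors' code. Several details are assumptions: the abstract names only MCV, RDW, and MCH, so the other six analyte names are typical CBC parameters chosen as placeholders; the difference is taken as current minus previous; and the function names and the input layout (a dict of patient id to chronologically ordered samples) are hypothetical.

```python
import random
from itertools import combinations

# ASSUMPTION: only MCV, RDW and MCH are named in the abstract; the remaining
# six names are common CBC parameters used here purely as placeholders.
ANALYTES = ["MCV", "RDW", "MCH", "WBC", "RBC", "HGB", "HCT", "MCHC", "PLT"]


def pair_features(previous, current, analytes=ANALYTES):
    """Per-analyte first-order difference for one sample pair.

    ASSUMPTION: the sign convention (current minus previous) is illustrative.
    """
    return [current[a] - previous[a] for a in analytes]


def build_pairs(samples_by_patient, seed=0):
    """Return (features, label) rows; label 1 marks a simulated WBIT pair.

    samples_by_patient maps a patient id to that patient's samples in
    chronological order; each sample is a dict of analyte name -> value.
    At least two patients are required so a foreign sample can be drawn.
    """
    rng = random.Random(seed)
    patient_ids = list(samples_by_patient)
    rows = []
    for pid, samples in samples_by_patient.items():
        for previous, current in zip(samples, samples[1:]):
            # Non-WBIT case: two consecutive samples from the same patient.
            rows.append((pair_features(previous, current), 0))
            # Simulated WBIT: replace the current sample with a sample
            # drawn at random from a different patient.
            other = rng.choice([p for p in patient_ids if p != pid])
            foreign = rng.choice(samples_by_patient[other])
            rows.append((pair_features(previous, foreign), 1))
    return rows


def all_analyte_subsets(analytes=ANALYTES):
    """Yield every non-empty subset of the analytes, smallest first."""
    for size in range(1, len(analytes) + 1):
        yield from combinations(analytes, size)
```

With nine analytes, an exhaustive search covers 2^9 - 1 = 511 non-empty subsets, so training one model per subset remains tractable. In a fuller pipeline, each subset would select the matching feature columns before a classifier such as a Random Forest is fitted and scored.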


Source journal

Journal of Personalized Medicine (Medicine: Medicine (miscellaneous))

CiteScore: 4.10

Self-citation rate: 0.00%

Articles published: 1878

Review time: 11 weeks

About the journal: Journal of Personalized Medicine (JPM; ISSN 2075-4426) is an international, open access journal aimed at bringing all aspects of personalized medicine to one platform. JPM publishes cutting-edge, innovative preclinical and translational scientific research and technologies related to personalized medicine (e.g., pharmacogenomics/proteomics, systems biology). JPM recognizes that personalized medicine, the assessment of genetic, environmental and host factors that cause variability of individuals, is a challenging, transdisciplinary topic that requires discussions from a range of experts. For a comprehensive perspective of personalized medicine, JPM aims to integrate expertise from the molecular and translational sciences, therapeutics and diagnostics, as well as discussions of regulatory, social, ethical and policy aspects. We provide a forum to bring together academic and clinical researchers, biotechnology, diagnostic and pharmaceutical companies, health professionals, regulatory and ethical experts, and government and regulatory authorities.