Impact of analytical bias on machine learning models for sepsis prediction using laboratory data
Meryem Rumeysa Yesil, Ilaria Talli, Michela Pelloso, Chiara Cosma, Elisa Pangrazzi, Mario Plebani, Yasemin Ustundag, Andrea Padoan
Clinical Chemistry and Laboratory Medicine. Published 2025-05-28. DOI: 10.1515/cclm-2025-0491 (https://doi.org/10.1515/cclm-2025-0491)
Abstract
Objectives: Machine learning (ML) models that use laboratory data can support early sepsis prediction. However, analytical bias in laboratory measurements can compromise their performance and validity in real-world settings. We aimed to evaluate how analytically acceptable bias may affect the validity and generalizability of ML models trained on laboratory data.
Methods: A support vector machine (SVM) model for sepsis prediction was developed using complete blood count and erythrocyte sedimentation rate data from outpatients (CS, n=104) and patients from acute inflammatory status wards (SS, n=107). Twenty-six bias conditions were derived by applying biases taken from analytical performance specifications (APS) to white blood cell (WBC), platelet (PLT), and erythrocyte sedimentation rate (ESR) values. The diagnostic performance under each of the 26 conditions was compared with that on the original dataset.
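The bias step described in the Methods can be pictured with a minimal sketch. It assumes the bias is proportional, so each value is multiplied by (1 + bias/100); the abstract does not state how the bias was applied. The function name `apply_bias`, the sample values, and the particular pairing of biases in `condition` are hypothetical and for illustration only.

```python
from typing import Dict, Mapping


def apply_bias(sample: Mapping[str, float],
               bias_percent: Mapping[str, float]) -> Dict[str, float]:
    """Return a copy of `sample` with each listed analyte scaled by (1 + bias/100).

    Assumption: bias is proportional (a percentage of the measured value).
    Analytes not named in `bias_percent` are left unchanged.
    """
    biased = dict(sample)
    for analyte, pct in bias_percent.items():
        biased[analyte] = sample[analyte] * (1.0 + pct / 100.0)
    return biased


# Hypothetical measurements, not study data.
sample = {"WBC": 10.0, "PLT": 200.0, "ESR": 20.0}

# One illustrative condition: negative WBC bias combined with positive ESR bias.
condition = {"WBC": -5.1, "ESR": +31.6}

biased = apply_bias(sample, condition)
print(biased)
```

A model trained on the original values would then be evaluated on records transformed this way, one bias condition at a time.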
Results: The SVM achieved an AUC of 90.6 % [95 % CI: 80.6-98.7 %] on the original dataset. Minimum, desirable and optimum acceptable biases were 7.7, 5.1 and 2.6 % for WBC; 6.7, 4.5 and 2.2 % for PLT; and 31.6, 21.1 and 10.5 % for ESR, respectively. For biases applied to individual parameters, reported AUCs were 89.8 % [95 % CI: 79.0-97.7 %] for PLT bias -6.7 %, 89.5 % [95 % CI: 79.1-98.0 %] for ESR bias +31.6 %, and 90.4 % [95 % CI: 79.3-98.4 %] for WBC bias -5.1 %. With a combination of biases, the lowest AUC was 87.8 % [95 % CI: 75.9-96.6 %]. No statistically significant differences in AUC were observed (p>0.05).
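The three tiers reported for each analyte match the common biological-variation convention, in which the minimum specification is 1.5 times the desirable one and the optimum is 0.5 times the desirable one. The abstract does not say the values were derived this way, so this is an assumption. The short check below only confirms that the reported numbers agree with that pattern to within rounding; `tiers_from_desirable` is a hypothetical helper.

```python
from typing import Dict, Tuple

# Acceptable biases in percent as reported in the abstract,
# ordered (minimum, desirable, optimum).
reported: Dict[str, Tuple[float, float, float]] = {
    "WBC": (7.7, 5.1, 2.6),
    "PLT": (6.7, 4.5, 2.2),
    "ESR": (31.6, 21.1, 10.5),
}


def tiers_from_desirable(desirable: float) -> Tuple[float, float, float]:
    """Assumed convention: minimum = 1.5 x desirable, optimum = 0.5 x desirable."""
    return 1.5 * desirable, desirable, 0.5 * desirable


# Largest gap between reported and computed tiers, per analyte.
max_gap: Dict[str, float] = {}
for analyte, (minimum, desirable, optimum) in reported.items():
    calc_min, _, calc_opt = tiers_from_desirable(desirable)
    max_gap[analyte] = max(abs(calc_min - minimum), abs(calc_opt - optimum))
    print(f"{analyte}: computed minimum {calc_min:.2f}, optimum {calc_opt:.2f}")
```

The small residual gaps are what one would expect if the published figures were rounded from unrounded specifications.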
Conclusions: Bias can influence model performance depending on the parameters and their combinations. Developing new validation strategies to assess the impact of analytical bias on laboratory data in ML models could improve their reliability.
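One possible shape for such a validation check is sketched below: score the same cases with and without a bias on one analyte and compare the AUCs. This is an illustration under stated assumptions, not the authors' procedure. The rows are synthetic, `toy_score` is a hypothetical stand-in for a trained model, the bias is assumed proportional, and the AUC is computed as the fraction of positive-negative pairs ranked correctly.

```python
from typing import Callable, Dict, Sequence, Tuple

Row = Dict[str, float]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the share of (positive, negative) pairs where the positive
    case scores higher; tied pairs count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_under_bias(score: Callable[[Row], float], rows: Sequence[Row],
                   labels: Sequence[int], analyte: str,
                   bias_percent: float) -> Tuple[float, float]:
    """Return (AUC on original rows, AUC with a proportional bias on one analyte)."""
    factor = 1.0 + bias_percent / 100.0
    biased_rows = [{**row, analyte: row[analyte] * factor} for row in rows]
    original = auc([score(r) for r in rows], labels)
    biased = auc([score(r) for r in biased_rows], labels)
    return original, biased


# Synthetic cases and a toy scoring rule; neither comes from the study.
rows = [
    {"WBC": 14.0, "PLT": 150.0},
    {"WBC": 11.0, "PLT": 260.0},
    {"WBC": 7.0, "PLT": 240.0},
    {"WBC": 9.0, "PLT": 180.0},
]
labels = [1, 1, 0, 0]


def toy_score(row: Row) -> float:
    return row["WBC"] - row["PLT"] / 50.0


baseline, shifted = auc_under_bias(toy_score, rows, labels, "PLT", -6.7)
print(baseline, shifted)
```

In practice the scoring function would be the fitted model's decision function, and the comparison would be repeated for every bias condition of interest, with confidence intervals on each AUC.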
About the journal:
Clinical Chemistry and Laboratory Medicine (CCLM) publishes articles on novel teaching and training methods applicable to laboratory medicine. CCLM welcomes contributions on the progress in fundamental and applied research and cutting-edge clinical laboratory medicine. It is one of the leading journals in the field, with an impact factor over 3. CCLM is issued monthly, and it is published in print and electronically.
CCLM is the official journal of the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) and regularly publishes EFLM recommendations and news. CCLM is the official journal of the National Societies from Austria (ÖGLMKC); Belgium (RBSLM); Germany (DGKL); Hungary (MLDT); Ireland (ACBI); Italy (SIBioC); Portugal (SPML); and Slovenia (SZKK); and it is affiliated with AACB (Australia) and SFBC (France).
Topics:
- clinical biochemistry
- clinical genomics and molecular biology
- clinical haematology and coagulation
- clinical immunology and autoimmunity
- clinical microbiology
- drug monitoring and analysis
- evaluation of diagnostic biomarkers
- disease-oriented topics (cardiovascular disease, cancer diagnostics, diabetes)
- new reagents, instrumentation and technologies
- new methodologies
- reference materials and methods
- reference values and decision limits
- quality and safety in laboratory medicine
- translational laboratory medicine
- clinical metrology