Impact of analytical bias on machine learning models for sepsis prediction using laboratory data.

IF 3.8, CAS Zone 2 (Medicine), JCR Q1 (Medical Laboratory Technology)
Meryem Rumeysa Yesil, Ilaria Talli, Michela Pelloso, Chiara Cosma, Elisa Pangrazzi, Mario Plebani, Yasemin Ustundag, Andrea Padoan
Clinical chemistry and laboratory medicine. Publication date: 2025-05-28. Publication type: Journal Article. DOI: 10.1515/cclm-2025-0491 (https://doi.org/10.1515/cclm-2025-0491)

Abstract

Objectives: Machine learning (ML) models that use laboratory data support early sepsis prediction. However, analytical bias in laboratory measurements can compromise their performance and validity in real-world settings. We aimed to evaluate how analytically acceptable bias may affect the validity and generalizability of ML models trained on laboratory data.

Methods: A support vector machine (SVM) model for sepsis prediction was developed using complete blood count and erythrocyte sedimentation rate data from outpatients (CS, n=104) and patients from acute inflammatory status wards (SS, n=107). Twenty-six bias conditions were derived by applying biases to white blood cells (WBC), platelets (PLT), and erythrocyte sedimentation rate (ESR), with bias magnitudes taken from analytical performance specifications (APS). The diagnostic performance under each of the 26 conditions was compared with that on the original dataset.
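
The abstract does not say how the 26 conditions were constructed. As a minimal sketch, assuming each of the three analytes is either left unchanged or shifted by a negative or positive proportional bias, the unbiased case excluded (3^3 - 1 = 26), the conditions could be enumerated as follows. The bias magnitudes are the minimum acceptable values from the Results; the function names and the sample values are hypothetical.

```python
from itertools import product

# Minimum acceptable bias (%) per analyte, as reported in the Results.
MIN_BIAS_PCT = {"WBC": 7.7, "PLT": 6.7, "ESR": 31.6}


def bias_conditions(bias_pct):
    """Yield every sign pattern (-1, 0, +1) across analytes except all-zero.

    Assumption: this enumeration is illustrative only; the paper may have
    combined bias levels and signs differently.
    """
    analytes = list(bias_pct)
    for signs in product((-1, 0, 1), repeat=len(analytes)):
        if any(signs):  # skip the unbiased original dataset
            yield {a: s * bias_pct[a] for a, s in zip(analytes, signs)}


def apply_bias(sample, condition):
    """Return a copy of one patient's results with proportional bias applied."""
    biased = dict(sample)
    for analyte, pct in condition.items():
        biased[analyte] = sample[analyte] * (1 + pct / 100)
    return biased


conditions = list(bias_conditions(MIN_BIAS_PCT))

# Hypothetical patient record, not taken from the study.
example = apply_bias(
    {"WBC": 10.0, "PLT": 200.0, "ESR": 20.0},
    {"WBC": 7.7, "PLT": 0.0, "ESR": -31.6},
)
```

Other constructions also give 26 (for instance 18 single-analyte conditions plus 8 combined sign patterns), so the count alone does not identify the authors' design.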

Results: On the original dataset, the SVM achieved an AUC of 90.6 % [95 % CI: 80.6-98.7 %]. The minimum, desirable and optimum acceptable biases were 7.7 %, 5.1 % and 2.6 % for WBC; 6.7 %, 4.5 % and 2.2 % for PLT; and 31.6 %, 21.1 % and 10.5 % for ESR. Across the tested conditions, the reported AUCs were 89.8 % [95 % CI: 79.0-97.7 %] for PLT bias -6.7 %, 89.5 % [95 % CI: 79.1-98.0 %] for ESR bias +31.6 %, and 90.4 % [95 % CI: 79.3-98.4 %] for WBC bias -5.1 %. With a combination of biases, the lowest AUC was 87.8 % [95 % CI: 75.9-96.6 %]. No statistically significant differences in AUC were observed (p>0.05).
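
The abstract does not state how the three bias levels were derived. Their roughly 3 : 2 : 1 ratio for each analyte is consistent with the conventional biological-variation model for bias specifications, shown here as an assumption rather than as the authors' stated method. CV_I and CV_G denote within-subject and between-subject biological variation.

```latex
\begin{aligned}
B_{\text{opt}} &= 0.125\,\sqrt{CV_I^2 + CV_G^2},\\
B_{\text{des}} &= 0.250\,\sqrt{CV_I^2 + CV_G^2},\\
B_{\text{min}} &= 0.375\,\sqrt{CV_I^2 + CV_G^2},
\end{aligned}
\qquad\Rightarrow\qquad
B_{\text{min}} : B_{\text{des}} : B_{\text{opt}} = 3 : 2 : 1 .
```

For WBC, for example, 1.5 × 5.1 % ≈ 7.7 % and 0.5 × 5.1 % ≈ 2.6 %, matching the reported values up to rounding.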

Conclusions: Bias can influence model performance depending on the parameters and their combinations. Developing new validation strategies to assess the impact of analytical bias on laboratory data in ML models could improve their reliability.
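
A bias-sensitivity check of the kind the conclusions call for could be sketched as below, assuming a model already trained on unbiased data is re-scored on inputs with a proportional bias applied. The linear `toy_score` stands in for the trained SVM, and the four records are synthetic; neither comes from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_under_bias(score_fn, samples, labels, condition):
    """Return (baseline AUC, AUC after applying percentage biases)."""
    baseline = auc([score_fn(s) for s in samples], labels)
    biased_samples = [
        {k: v * (1 + condition.get(k, 0.0) / 100) for k, v in s.items()}
        for s in samples
    ]
    return baseline, auc([score_fn(s) for s in biased_samples], labels)


def toy_score(sample):
    """Hypothetical linear risk score standing in for the trained model."""
    return 0.5 * sample["WBC"] + 0.1 * sample["ESR"] - 0.01 * sample["PLT"]


# Synthetic records for illustration only (label 1 = sepsis, 0 = control).
samples = [
    {"WBC": 15.0, "PLT": 150.0, "ESR": 60.0},
    {"WBC": 12.0, "PLT": 180.0, "ESR": 40.0},
    {"WBC": 7.0, "PLT": 250.0, "ESR": 10.0},
    {"WBC": 9.0, "PLT": 220.0, "ESR": 25.0},
]
labels = [1, 1, 0, 0]

baseline, biased = auc_under_bias(toy_score, samples, labels, {"ESR": 31.6})
```

Because AUC depends only on how samples are ranked, a uniform proportional bias changes it only when the bias alters the relative ordering of model scores, which may be one reason small shifts are plausible.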

Source journal
Clinical chemistry and laboratory medicine (category: Medicine, Medical Laboratory Technology)
CiteScore: 11.30
Self-citation rate: 16.20%
Articles published: 306
Review time: 3 months
About the journal: Clinical Chemistry and Laboratory Medicine (CCLM) publishes articles on novel teaching and training methods applicable to laboratory medicine. CCLM welcomes contributions on progress in fundamental and applied research and on cutting-edge clinical laboratory medicine. It is one of the leading journals in the field, with an impact factor over 3. CCLM is issued monthly and is published in print and electronically. CCLM is the official journal of the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) and regularly publishes EFLM recommendations and news. CCLM is the official journal of the National Societies of Austria (ÖGLMKC); Belgium (RBSLM); Germany (DGKL); Hungary (MLDT); Ireland (ACBI); Italy (SIBioC); Portugal (SPML); and Slovenia (SZKK); and it is affiliated with AACB (Australia) and SFBC (France).
Topics:
- clinical biochemistry
- clinical genomics and molecular biology
- clinical haematology and coagulation
- clinical immunology and autoimmunity
- clinical microbiology
- drug monitoring and analysis
- evaluation of diagnostic biomarkers
- disease-oriented topics (cardiovascular disease, cancer diagnostics, diabetes)
- new reagents, instrumentation and technologies
- new methodologies
- reference materials and methods
- reference values and decision limits
- quality and safety in laboratory medicine
- translational laboratory medicine
- clinical metrology