Unraveling Uncertainty: The Impact of Biological and Analytical Variation on the Prediction Uncertainty of Categorical Prediction Models.

Impact factor: 1.8 | JCR quartile: Q3 | Category: Medical Laboratory Technology

Journal of Applied Laboratory Medicine | Pub Date: 2025-03-03 | Pages: 339-351 | DOI: 10.1093/jalm/jfae115

Remy J H Martens, William P T M van Doorn, Mathie P G Leers, Steven J R Meex, Floris Helmich


Citations: 0

Abstract

Background: Interest in prediction models based on laboratory data, including machine learning (ML) models, has increased tremendously. Uncertainty in laboratory measurements and uncertainty in the predictions based on those measurements are inherently intertwined. This study developed a framework for assessing the impact of biological and analytical variation on the prediction uncertainty of categorical prediction models.

Methods: Practical application was demonstrated for the prediction of renal function loss (Chronic Kidney Disease Epidemiology Collaboration [CKD-EPI] equation) and 31-day mortality (advanced ML model) in 6360 emergency department patients. Model outcome was calculated in 100 000 simulations of variation in laboratory parameters. Subsequently, the percentage of discordant predictions was calculated with the original prediction as reference. Simulations were repeated assuming increasing levels of analytical variation.
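The simulation described in the Methods can be sketched for a single patient as follows. This is a minimal illustration, not the authors' implementation. Several details are assumptions that the abstract does not specify: the 2021 race-free CKD-EPI creatinine equation, a category cutoff of eGFR < 60 mL/min/1.73 m², normally distributed variation, quadrature combination of the two coefficients of variation, and the placeholder values of `cv_i` (biological) and `cv_a` (analytical). The function names are hypothetical.

```python
import numpy as np


def ckd_epi_2021(scr_mg_dl, age, female):
    """eGFR (mL/min/1.73 m^2) from serum creatinine in mg/dL.

    Assumption: the 2021 race-free CKD-EPI creatinine equation; the
    abstract does not say which CKD-EPI version was used.
    """
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = np.asarray(scr_mg_dl, dtype=float) / kappa
    egfr = (142.0
            * np.minimum(ratio, 1.0) ** alpha
            * np.maximum(ratio, 1.0) ** -1.200
            * 0.9938 ** age)
    return egfr * 1.012 if female else egfr


def discordant_percentage(scr, age, female, cv_i=0.045, cv_a=0.02,
                          cv_a_factor=1.0, threshold=60.0,
                          n_sim=100_000, seed=0):
    """Percentage of simulated predictions that differ from the original.

    cv_i, cv_a and threshold are illustrative placeholders, not values
    taken from the paper. cv_a_factor scales the analytical variation
    (for example 6.0 for "6x base analytical variation").
    """
    rng = np.random.default_rng(seed)
    # Assumption: biological and analytical variation combine in quadrature.
    cv_total = np.sqrt(cv_i ** 2 + (cv_a_factor * cv_a) ** 2)
    # Assumption: variation is normally distributed around the measured value.
    simulated = rng.normal(scr, scr * cv_total, size=n_sim)
    simulated = np.clip(simulated, 1e-6, None)  # keep creatinine positive
    reference = ckd_epi_2021(scr, age, female) < threshold
    predicted = ckd_epi_2021(simulated, age, female) < threshold
    return 100.0 * float(np.mean(predicted != reference))


if __name__ == "__main__":
    # A patient far from the cutoff versus one close to it.
    print(discordant_percentage(0.80, age=40, female=False))
    print(discordant_percentage(1.35, age=60, female=False))
    # Same borderline patient at 6x the assumed base analytical variation.
    print(discordant_percentage(1.25, age=60, female=False, cv_a_factor=6.0))
```

Under these assumptions, a patient whose original result lies far from the cutoff yields almost no discordant predictions, while a patient close to the cutoff can flip category in a large share of simulations. For a multi-parameter model, the same loop would perturb every laboratory input before calling the model.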

Results: For the ML model, area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity were 0.90, 0.44, and 0.96, respectively. At base analytical variation, the median [2.5th-97.5th percentiles] percentage of discordant predictions was 0% [0%-28.8%]. In addition, 7.2% of patients had >5% discordant predictions. At 6× base analytical variation, the median [2.5th-97.5th percentiles] percentage of discordant predictions was 0% [0%-38.8%]. In addition, 11.7% of patients had >5% discordant predictions. However, the impact of analytical variation was limited compared with biological variation. AUROC, sensitivity, and specificity were not affected by variation in laboratory parameters.
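The summary statistics reported in the Results (median, 2.5th to 97.5th percentiles, and the share of patients with more than 5% discordant predictions) can be computed from per-patient discordance percentages along these lines. The function name and the example data are hypothetical; the numbers below are synthetic and are not the study's results.

```python
import numpy as np


def summarize_discordance(pct_per_patient, cutoff=5.0):
    """Summarize per-patient percentages of discordant predictions.

    Returns the median, the 2.5th and 97.5th percentiles, and the
    percentage of patients whose discordance exceeds `cutoff`.
    """
    pct = np.asarray(pct_per_patient, dtype=float)
    low, median, high = np.percentile(pct, [2.5, 50.0, 97.5])
    return {
        "median": float(median),
        "p2_5": float(low),
        "p97_5": float(high),
        "share_above_cutoff": float(100.0 * np.mean(pct > cutoff)),
    }


if __name__ == "__main__":
    # Synthetic cohort: 90 patients with no discordance, 10 with 10%.
    example = [0.0] * 90 + [10.0] * 10
    print(summarize_discordance(example))
```

A median of 0% alongside a wide upper percentile, as in the reported results, indicates that most patients are unaffected while a minority near a decision boundary account for nearly all of the discordance.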

Conclusions: The impact of biological and analytical variation on the prediction uncertainty of categorical prediction models, including ML models, can be estimated by the occurrence of discordant predictions in a simulation model. Nevertheless, discordant predictions at the individual level do not necessarily affect model performance at the population level.


Source journal: Journal of Applied Laboratory Medicine (Medical Laboratory Technology)

CiteScore: 3.70

Self-citation rate: 5.00%

Articles published: 137