How is test laboratory data used and characterised by machine learning models? A systematic review of diagnostic and prognostic models developed for COVID-19 patients using only laboratory data

Clinical Chemistry and Laboratory Medicine (CCLM). Pub Date: 2022-05-05. DOI: 10.1515/cclm-2022-0182

A. Carobene, Frida Milella, Lorenzo Famiglini, F. Cabitza

Pages: 1887-1901

Citations: 17

Abstract

The current gold standard for COVID-19 diagnosis, the rRT-PCR test, is hampered by long turnaround times, probable reagent shortages, high false-negative rates and high prices. As a result, machine learning (ML) methods have recently piqued interest, particularly when applied to digital imagery (X-rays and CT scans). This review considers the literature on ML-based diagnostic and prognostic studies grounded on hematochemical parameters, addressing a gap in the current literature concerning the application of machine learning to laboratory medicine. Sixty-eight articles, extracted from the Scopus and PubMed indexes, were included. These studies were marked by a great deal of heterogeneity in terms of the examined laboratory tests and clinical parameters, sample size, reference populations, ML algorithms, and validation approaches. The majority of the research was found to be hampered by reporting and replicability issues: only four of the surveyed studies provided complete information on analytic procedures (units of measure, analyzing equipment), while 29 provided no information at all. Only 16 studies included independent external validation. In light of these findings, we discuss the importance of closer collaboration between data scientists and medical laboratory professionals in order to correctly characterise the relevant population, select the most appropriate statistical and analytical methods, ensure reproducibility, enable the proper interpretation of the results, and gain actual utility by using machine learning methods in clinical practice.

