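Comparison between five pattern-based approaches for automated diagnostic classification of mature/peripheral B-cell neoplasms based on standardized EuroFlow flow cytometry immunophenotypic data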

IF 7; Zone 2 (Medicine); JCR Q1 (BIOLOGY)

Computers in Biology and Medicine; Pub Date: 2025-04-28; DOI: 10.1016/j.compbiomed.2025.110194

C.E. Pedreira, Q. Lecrevisse, R. Fluxa, J. Verde, S. Barrena, J. Flores-Montero, P. Fernandez, D. Morf, V.H.J. van der Velden, E. Mejstrikova, J. Caetano, L. Burgos, S. Böttcher, J.J.M. van Dongen, A. Orfao, EuroFlow

Volume 192, Article 110194; Journal Article; https://www.sciencedirect.com/science/article/pii/S0010482525005451

Citations: 0

Abstract


Flow cytometry immunophenotyping is critical for the diagnostic classification of mature/peripheral B-cell neoplasms/B-cell chronic lymphoproliferative disorders (B-CLPD). Quantitatively driven classification approaches applied to multiparameter flow cytometry immunophenotypic data can extract maximum information from the multidimensional space created by individual parameters (e.g., immunophenotypic markers), for highly accurate and automated classification of individual patient (sample) data. Here, we developed and compared five diagnostic classification algorithms, based on a large set of EuroFlow multicentric flow cytometry data files from a cohort of 659 B-CLPD patients. These included automatic population separators based on Principal Component Analysis (PCA), Canonical Variate Analysis (CVA), Neighbourhood Component Analysis (NCA), Support Vector Machine (SVM) algorithms and a variant of the Canonical Analysis (CA) algorithm in which the number of standard deviations (SDs) varied for each pairwise comparison between diseases (CA-vSD). All five classification approaches are based on direct prospective interrogation of individual B-CLPD patients against the EuroFlow flow cytometry B-CLPD database, composed of tumor B-cells from 659 individual patients stained in an identical way and classified a priori by the World Health Organization (WHO) criteria into nine diagnostic categories. Each classification approach was evaluated in parallel in terms of accuracy (% properly classified cases), precision (multiple or single diagnosis/case) and coverage (% cases with a proposed diagnosis). Overall, the five algorithms yielded average rates of correct diagnosis (across the nine B-CLPD diagnostic entities) of between 58.9 % and 90.6 %, with variable percentages of cases being either misclassified (4.1 %–14.0 %) or unclassifiable (0.3 %–37.0 %). Automatic population separators based on CA, SVM and PCA showed a high average level of correctness (90.6 %, 86.8 %, and 86.0 %, respectively). Nevertheless, this came at the expense of proposing multiple diagnoses for a significant proportion of the test cases (54.5 %, 53.5 %, and 49.6 %, respectively). The CA-vSD algorithm generated the lowest average misclassification rate (4.1 %), but proposed no diagnosis for 37.0 % of cases. In contrast, the NCA algorithm left only 2.7 % of cases without an associated diagnosis but misclassified 14.0 %. Among the cases NCA classified correctly (83.3 % of total), 91.2 % had a single proposed diagnosis, 8.6 % had two possible diagnoses, and 0.2 % had three. We demonstrate that the proposed AI algorithms provide an acceptable level of accuracy for the diagnostic classification of B-CLPD patients and, in general, surpass other algorithms reported in the literature.
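The evaluation criteria named in the abstract (correct, misclassified and unclassifiable rates, coverage, and single versus multiple proposed diagnoses) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a case counts as correct when the reference diagnosis is among the proposed ones, and as unclassifiable when no diagnosis is proposed; the abstract does not spell out these rules. The names `Outcome` and `summarize`, the labels and the toy data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """One test case: reference diagnosis plus the classifier's proposals."""
    true_label: str
    proposed: frozenset  # empty set = no diagnosis proposed (assumption)


def summarize(outcomes):
    """Return percentage rates over a list of Outcome objects."""
    n = len(outcomes)
    if n == 0:
        raise ValueError("no outcomes to summarize")
    unclassified = sum(1 for o in outcomes if not o.proposed)
    # Assumption: correct means the reference diagnosis is among the proposals.
    correct = [o for o in outcomes if o.true_label in o.proposed]
    misclassified = n - unclassified - len(correct)
    single = sum(1 for o in correct if len(o.proposed) == 1)
    return {
        "correct_pct": 100 * len(correct) / n,
        "misclassified_pct": 100 * misclassified / n,
        "unclassified_pct": 100 * unclassified / n,
        "coverage_pct": 100 * (n - unclassified) / n,
        "single_among_correct_pct": 100 * single / len(correct) if correct else 0.0,
    }


# Hypothetical toy data: ten cases over three made-up labels A, B, C.
toy = [
    Outcome("A", frozenset({"A"})),
    Outcome("A", frozenset({"A"})),
    Outcome("A", frozenset({"A", "B"})),  # correct, two proposals
    Outcome("B", frozenset({"B"})),
    Outcome("B", frozenset({"C"})),       # misclassified
    Outcome("B", frozenset()),            # unclassifiable
    Outcome("C", frozenset({"C"})),
    Outcome("C", frozenset({"C"})),
    Outcome("C", frozenset({"A"})),       # misclassified
    Outcome("C", frozenset({"C", "B"})),  # correct, two proposals
]

summary = summarize(toy)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the correct, misclassified and unclassifiable rates partition the cohort, which agrees with the NCA figures quoted in the abstract (83.3 % + 14.0 % + 2.7 % = 100 %).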


Source journal

Computers in Biology and Medicine (Engineering & Technology: Biomedical Engineering)

CiteScore: 11.70
Self-citation rate: 10.40%
Articles published: 1086
Review time: 74 days

About the journal: Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.