C.E. Pedreira, Q. Lecrevisse, R. Fluxa, J. Verde, S. Barrena, J. Flores-Montero, P. Fernandez, D. Morf, V.H.J. van der Velden, E. Mejstrikova, J. Caetano, L. Burgos, S. Böttcher, J.J.M. van Dongen, A. Orfao, EuroFlow
Computers in Biology and Medicine, volume 192, Article 110194. Published 2025-04-28. DOI: 10.1016/j.compbiomed.2025.110194. https://www.sciencedirect.com/science/article/pii/S0010482525005451
Comparison between five pattern-based approaches for automated diagnostic classification of mature/peripheral B-cell neoplasms based on standardized EuroFlow flow cytometry immunophenotypic data
Flow cytometry immunophenotyping is critical for the diagnostic classification of mature/peripheral B-cell neoplasms/B-cell chronic lymphoproliferative disorders (B-CLPD). Quantitatively driven classification approaches applied to multiparameter flow cytometry immunophenotypic data can be used to extract maximum information from the multidimensional space created by individual parameters (e.g., immunophenotypic markers), for highly accurate and automated classification of individual patient (sample) data. Here, we developed and compared five diagnostic classification algorithms, based on a large set of EuroFlow multicentric flow cytometry data files from a cohort of 659 B-CLPD patients. These included automatic population separators based on Principal Component Analysis (PCA), Canonical Variate Analysis (CVA), Neighbourhood Component Analysis (NCA), Support Vector Machine algorithms (SVM) and a variant of the CA (Canonical Analysis) algorithm in which the number of SDs (Standard Deviations) varied for each comparison between a pair of diseases (CA-vSD). All five classification approaches are based on direct prospective interrogation of individual B-CLPD patients against the EuroFlow flow cytometry B-CLPD database, composed of tumor B-cells from 659 individual patients stained in an identical way and classified a priori by the World Health Organization (WHO) criteria into nine diagnostic categories. Each classification approach was evaluated in parallel in terms of accuracy (% properly classified cases), precision (multiple or single diagnosis/case) and coverage (% cases with a proposed diagnosis). Overall, average rates of correct diagnosis (for the nine B-CLPD diagnostic entities) of between 58.9 % and 90.6 % were obtained with the five algorithms, with variable percentages of cases being either misclassified (4.1 %–14.0 %) or unclassifiable (0.3 %–37.0 %). 
Automatic population separators based on CA, SVM and PCA showed a high average level of correctness (90.6 %, 86.8 %, and 86.0 %, respectively). Nevertheless, this came at the expense of proposing multiple diagnoses for a substantial proportion of the test cases (54.5 %, 53.5 %, and 49.6 %, respectively). The CA-vSD algorithm generated the smallest average misclassification rate (4.1 %), but proposed no diagnosis for 37.0 % of cases. In contrast, the NCA algorithm left only 2.7 % of cases without an associated diagnosis but misclassified 14.0 %. Among the cases it classified correctly (83.3 % of total), 91.2 % had a single proposed diagnosis, 8.6 % had two possible diagnoses, and 0.2 % had three. We demonstrate that the proposed AI algorithms provide an acceptable level of accuracy for the diagnostic classification of B-CLPD patients and, in general, surpass other algorithms reported in the literature.
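The three evaluation criteria named in the abstract (accuracy, precision, coverage) can be illustrated with a small sketch. This is not the authors' code: the function names, the toy cases, and the generic labels "A", "B", "C" are hypothetical, and the rule that a case counts as correct when the true diagnosis appears among the proposed ones is an assumption. The last helper only checks that the rates quoted for CA-vSD and NCA are consistent with the three outcomes (correct, misclassified, no diagnosis) summing to 100 %.

```python
from collections import Counter


def evaluate(cases):
    """Summarise set-valued diagnoses.

    cases: list of (true_label, proposed_labels) pairs, where proposed_labels
    is a set that may be empty (no diagnosis proposed).
    Assumption: a case is "correct" if the true label is among the proposals.
    """
    n = len(cases)
    outcome = Counter()
    proposals_when_correct = Counter()
    for truth, proposed in cases:
        if not proposed:
            outcome["unclassified"] += 1
        elif truth in proposed:
            outcome["correct"] += 1
            proposals_when_correct[len(proposed)] += 1
        else:
            outcome["misclassified"] += 1

    def pct(count, total):
        return round(100.0 * count / total, 1) if total else 0.0

    return {
        "correct_pct": pct(outcome["correct"], n),
        "misclassified_pct": pct(outcome["misclassified"], n),
        "unclassified_pct": pct(outcome["unclassified"], n),
        # Coverage: share of cases that received at least one proposal.
        "coverage_pct": pct(n - outcome["unclassified"], n),
        # Precision in the abstract's sense: single vs. multiple diagnoses.
        "single_pct_of_correct": pct(proposals_when_correct[1], outcome["correct"]),
    }


def implied_correct_pct(misclassified_pct, unclassified_pct):
    """Correct rate implied if the three outcomes partition all cases."""
    return round(100.0 - misclassified_pct - unclassified_pct, 1)


# Hypothetical toy data: ten cases with generic labels.
toy_cases = [
    ("A", {"A"}), ("A", {"A", "B"}), ("B", {"B"}), ("B", {"C"}), ("C", set()),
    ("C", {"C"}), ("A", {"A"}), ("B", {"A", "B", "C"}), ("C", {"C"}), ("A", set()),
]
summary = evaluate(toy_cases)
print(summary)

# Rates quoted in the abstract: CA-vSD (4.1 % misclassified, 37.0 % no
# diagnosis) and NCA (14.0 % misclassified, 2.7 % no diagnosis).
print(implied_correct_pct(4.1, 37.0), implied_correct_pct(14.0, 2.7))
```

Under this reading, the implied correct rates are 58.9 % for CA-vSD and 83.3 % for NCA, which match the lower bound of the reported range and the "83.3 % of total" figure, respectively.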
About the journal:
Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.