Leveraging Machine Learning for Rapid and Accurate Diagnosis of Acute Leukemia.

International Journal of Laboratory Hematology. Pub Date: 2025-09-09. DOI: 10.1111/ijlh.14555

Beulah Priscilla Maddirala, Gurleen Oberoi, Anand Kakarla, Beena Chandrasekhar, Ajay Gupta, Reena Nakra, Vandana Lal

Abstract

Context: Early detection of acute leukemia (AL) is crucial for timely intervention and improved outcomes. Machine learning (ML) models provide a promising approach for early screening and rapid diagnosis of AL, minimizing delays in referral.

Objectives: To assess the utility of leukocyte cell population data (CPD) through ML models for detecting AL. To subclassify AL into acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL) using CPD morphometry at a pre-microscopic level. To perform feature analysis on the ML prediction model.

Methods: We analyzed 1211 cases, including 810 confirmed AL cases (by morphology, immunophenotype, or molecular methods) and 401 benign cases. Leukocyte parameters and CPD from a Sysmex XN1000 analyzer (WDF channel) were used for classification. Four ML models (LightGBM, CatBoost, TabNet, and XGBoost) were trained, and the optimal model was selected based on accuracy from 5-fold cross-validation. Feature contributions were evaluated using SHAP.
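
The selection step described in the Methods (pick the model with the best accuracy under 5-fold cross-validation) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the stratified fold assignment, the function names, the two toy "models" (a majority-class baseline and a one-feature nearest-centroid rule), and the synthetic data are all assumptions standing in for the real classifiers and CPD features.

```python
import random
from collections import defaultdict
from statistics import mean


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds while keeping class proportions similar."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)  # deal each class round-robin across folds
    return folds


def cv_accuracy(fit, X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds.

    `fit(X_train, y_train)` must return a function mapping one sample to a label.
    """
    scores = []
    for held_out in stratified_folds(y, k, seed):
        test = set(held_out)
        train = [i for i in range(len(y)) if i not in test]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return mean(scores)


def select_model(candidates, X, y, k=5):
    """Return (best_name, scores); best has the highest mean CV accuracy."""
    scores = {name: cv_accuracy(fit, X, y, k) for name, fit in candidates.items()}
    return max(scores, key=scores.get), scores


# Hypothetical stand-ins for the real models, operating on one numeric feature.
def fit_majority(X_train, y_train):
    majority = max(set(y_train), key=y_train.count)
    return lambda x: majority


def fit_nearest_centroid(X_train, y_train):
    sums = defaultdict(lambda: [0.0, 0])
    for x, label in zip(X_train, y_train):
        sums[label][0] += x
        sums[label][1] += 1
    centroids = {label: s / n for label, (s, n) in sums.items()}
    return lambda x: min(centroids, key=lambda label: abs(x - centroids[label]))


# Synthetic toy data (not study data): one well-separated feature.
y = ["benign"] * 10 + ["AL"] * 20
X = [float(i % 3) for i in range(10)] + [10.0 + (i % 3) for i in range(20)]

best, scores = select_model(
    {"majority": fit_majority, "nearest_centroid": fit_nearest_centroid}, X, y
)
print(best, scores)
```

In practice the candidates would be the fitted gradient-boosting or TabNet models, and the same folds should be reused across candidates so that the accuracy comparison is like for like, as the fixed `seed` does here.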

Results: Heat maps and UMAP projections effectively differentiated AL from benign cases and AML from ALL. XGBoost achieved the best performance, with 88% sensitivity and 94% specificity. ROC-AUC scores were 0.88 for AML, 0.87 for ALL, and 0.99 for benign cases. Key features identified included NE-WY, MO-WZ, LYMPH, NE-WZ, NEUT, and MONO#.
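
For readers less familiar with the reported metrics, the sketch below shows the standard definitions of sensitivity, specificity, and a per-class (one-vs-rest) ROC-AUC. The abstract does not say how its figures were computed, so collapsing AML and ALL into a single "AL" positive class for sensitivity and specificity is an assumption, as are the function names and the toy labels and scores.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def one_vs_rest_auc(y_true, class_scores):
    """AUC per class, treating that class as positive and all others as negative."""
    result = {}
    for cls, scores in class_scores.items():
        pos = [s for s, t in zip(scores, y_true) if t == cls]
        neg = [s for s, t in zip(scores, y_true) if t != cls]
        result[cls] = roc_auc(pos, neg)
    return result


# Toy labels and scores (not study data).
y_true = ["AML", "AML", "ALL", "ALL", "benign", "benign"]
y_pred = ["AML", "ALL", "ALL", "benign", "benign", "benign"]


# Assumption: AML and ALL both count as "AL" for the screening metrics.
def collapse(labels):
    return ["benign" if label == "benign" else "AL" for label in labels]


sens, spec = sensitivity_specificity(collapse(y_true), collapse(y_pred), "AL")

class_scores = {
    "AML": [0.8, 0.4, 0.3, 0.1, 0.1, 0.0],
    "ALL": [0.1, 0.5, 0.6, 0.3, 0.1, 0.1],
    "benign": [0.1, 0.1, 0.1, 0.6, 0.8, 0.9],
}
aucs = one_vs_rest_auc(y_true, class_scores)
print(sens, spec, aucs)
```

Note that an AML case predicted as ALL still counts as a true positive under the collapsed AL-versus-benign view; lineage errors only show up in the per-class AUCs.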

Conclusion: ML models based on leukocyte and CPD parameters enhance the predictability of AL detection and lineage differentiation at a pre-microscopic level. Integrating these models into hematology analyzers provides a cost-effective, novel tool for detection and differentiation. Interpretable predictions assist experts, reducing subjectivity and expediting final diagnosis through immunophenotyping and molecular studies.

