Leveraging Machine Learning for Rapid and Accurate Diagnosis of Acute Leukemia
Authors: Beulah Priscilla Maddirala, Gurleen Oberoi, Anand Kakarla, Beena Chandrasekhar, Ajay Gupta, Reena Nakra, Vandana Lal
Journal: International Journal of Laboratory Hematology (journal article), published 2025-09-09
DOI: 10.1111/ijlh.14555 (https://doi.org/10.1111/ijlh.14555)
Citations: 0
Abstract
Context: Early detection of acute leukemia (AL) is crucial for timely intervention and improved outcomes. Machine learning (ML) models provide a promising approach for early screening and rapid diagnosis of AL, minimizing delays in referral.
Objectives: (1) To assess the utility of leukocyte cell population data (CPD) in ML models for detecting AL; (2) to subclassify AL into acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL) using CPD morphometry at a pre-microscopic level; and (3) to perform feature analysis on the ML prediction model.
Methods: We analyzed 1211 cases, including 810 confirmed AL cases (by morphology, immunophenotype, or molecular methods) and 401 benign cases. Leukocyte parameters and CPD from a Sysmex XN1000 analyzer (WDF channel) were used for classification. Four ML models (LightGBM, CatBoost, TabNet, and XGBoost) were trained, and the best-performing model was selected by accuracy under 5-fold cross-validation. Feature contributions were evaluated using SHAP (SHapley Additive exPlanations).
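The model-selection step described in the Methods can be illustrated with a minimal, standard-library sketch of stratified 5-fold cross-validation that picks the candidate with the highest mean held-out accuracy. This is not the authors' pipeline: the two classifiers (`MajorityClass`, `NearestCentroid`) are hypothetical stand-ins for the gradient-boosting and TabNet models, the two-feature data are synthetic, and every function name here is an assumption made for illustration.

```python
import random
from statistics import mean


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal each class out round-robin
    return folds


def cross_val_accuracy(make_model, X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds for a model factory."""
    scores = []
    for held_out in stratified_folds(y, k, seed):
        test_set = set(held_out)
        train = [i for i in range(len(y)) if i not in test_set]
        model = make_model()
        model.fit([X[i] for i in train], [y[i] for i in train])
        preds = model.predict([X[i] for i in held_out])
        correct = sum(p == y[i] for p, i in zip(preds, held_out))
        scores.append(correct / len(held_out))
    return mean(scores)


class MajorityClass:
    """Baseline stand-in: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label for _ in X]


class NearestCentroid:
    """Stand-in classifier: assigns each case to the closest class mean."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))

        return [
            min(self.centroids, key=lambda c: sq_dist(row, self.centroids[c]))
            for row in X
        ]


def synthetic_cases(n_per_class=60, seed=1):
    """Synthetic two-feature data; NOT real CPD values."""
    rng = random.Random(seed)
    centers = {"benign": (0.0, 0.0), "AML": (3.0, 1.0), "ALL": (1.0, 3.0)}
    X, y = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)])
            y.append(label)
    return X, y


X, y = synthetic_cases()
candidates = {"majority_class": MajorityClass, "nearest_centroid": NearestCentroid}
cv_scores = {name: cross_val_accuracy(factory, X, y) for name, factory in candidates.items()}
best_model = max(cv_scores, key=cv_scores.get)
print(cv_scores, "->", best_model)
```

In practice the factories would wrap the actual libraries, and a stratified split matters here because the cohort is imbalanced (810 AL versus 401 benign cases), so unstratified folds could distort per-fold accuracy.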
Results: Heat maps and UMAP projections effectively differentiated AL from benign cases and AML from ALL. XGBoost achieved the best performance with 88% sensitivity and 94% specificity. ROC-AUC scores were 0.88 for AML, 0.87 for ALL, and 0.99 for benign. Key features identified included NE-WY, MO-WZ, LYMPH, NE-WZ, NEUT, and MONO#.
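For readers less familiar with the metrics reported in the Results, the following sketch shows how sensitivity, specificity, and per-class ROC-AUC are commonly computed. It assumes that sensitivity and specificity refer to a binary AL-versus-benign decision and that the per-class AUCs are one-vs-rest scores; the abstract does not state either explicitly. The labels and scores are toy values chosen for illustration, not study data, and the function names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """Return (sensitivity, specificity) treating `positive` as the disease class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(scores, is_positive):
    """AUC as the probability that a random positive case outscores a random
    negative case, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(y_true, class_scores):
    """Per-class AUC: each class in turn is 'positive', all others 'negative'."""
    return {
        cls: roc_auc(scores, [t == cls for t in y_true])
        for cls, scores in class_scores.items()
    }


# Toy binary example (AL vs benign): 3 of 4 AL cases and 3 of 4 benign cases correct.
y_true_bin = ["AL", "AL", "AL", "AL", "benign", "benign", "benign", "benign"]
y_pred_bin = ["AL", "AL", "AL", "benign", "benign", "benign", "benign", "AL"]
sens, spec = sensitivity_specificity(y_true_bin, y_pred_bin, positive="AL")

# Toy three-class example with made-up predicted probabilities per class.
y_true = ["AML", "AML", "ALL", "ALL", "benign", "benign"]
class_scores = {
    "AML": [0.7, 0.4, 0.5, 0.2, 0.1, 0.1],
    "ALL": [0.2, 0.5, 0.4, 0.7, 0.1, 0.1],
    "benign": [0.1, 0.1, 0.1, 0.1, 0.8, 0.8],
}
aucs = one_vs_rest_auc(y_true, class_scores)

print(sens, spec)  # 0.75 0.75
print(aucs)        # {'AML': 0.875, 'ALL': 0.875, 'benign': 1.0}
```

The pairwise AUC above is quadratic in the number of cases, which is fine for a cohort of about 1200 but would normally be replaced by a rank-based library routine.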
Conclusion: ML models based on leukocyte and CPD parameters enhance the predictability of AL detection and lineage differentiation at a pre-microscopic level. Integrating these models into hematology analyzers provides a cost-effective, novel tool for detection and differentiation. Interpretable predictions assist experts, reducing subjectivity and expediting final diagnosis through immunophenotyping and molecular studies.